(54) Nucleic acid library arrays, methods for synthesizing them and methods for sequencing and
(57) Methods for discriminating between fully complementary hybrids and those that differ by one or more
base pairs and libraries of unimolecular, double-stranded oligonucleotides on a solid support. In one
embodiment, the present invention provides methods of using nuclease treatment to improve the quality of
hybridization signals on high density oligonucleotide arrays. In another embodiment, the present invention
provides methods of using ligation reactions to improve the quality of hybridization signals on high density
oligonucleotide arrays. In yet another embodiment, the present invention provides libraries of unimolecular or
intermolecular, double-stranded oligonucleotides on a solid support. These libraries are useful in
pharmaceutical discovery for the screening of numerous biological samples for specific interactions between the
double-stranded oligonucleotides, and peptides, proteins, drugs and RNA. In a related aspect, the present
invention provides libraries of conformationally restricted probes on a solid support. The probes are restricted
in their movement and flexibility using double-stranded oligonucleotides as scaffolding. The probes are also
useful in various screening procedures associated with drug discovery and diagnosis. The present invention
further provides methods for the preparation and screening of the above libraries.
Description

CROSS-REFERENCE TO RELATED APPLICATIONS 

This application is a continuation-in-part of United States Serial No. 08/327,522, filed October 21, 1994, and United
States Serial Number 08/327,687, filed October 24, 1994, each of which is incorporated by reference in its entirety for
all purposes.

GOVERNMENT RIGHTS 

Research leading to the invention was funded in part by NIH Grant No.            and the Government
may have certain rights to the invention.

BACKGROUND OF THE INVENTION 

The relationship between structure and function of macromolecules is of fundamental importance in the
understanding of biological systems. Such relationships are important to understanding, for example, the functions of
enzymes, structural proteins, and signalling proteins, the ways in which cells communicate with one another, the
mechanisms of cellular control and metabolic feedback, etc.

Genetic information is critical in the continuation of life processes. Life is substantially informationally based and
genetic content controls the growth and reproduction of the organism and its complements. Proteins, which are critical
features of all living systems, are encoded by the genetic materials of the cell. More particularly, the properties of
enzymes, functional proteins and structural proteins are determined by the sequence of amino acids from which they
are made. As such, it has become very important to determine the genetic sequences of nucleotides which encode the
enzymes, structural proteins and other effectors of biological functions. In addition to the segments of nucleotides which
encode polypeptides, there are many nucleotide sequences which are involved in the control and regulation of gene
expression.

The human genome project is an example of a project that is directed toward determining the complete sequence
of the genome of the human organism. Although such a sequence would not necessarily correspond to the sequence
of any specific individual, it will provide significant information as to the general organization and specific sequences
contained within genomic segments from particular individuals. It will also provide mapping information useful for further
detailed studies. The need for highly rapid, accurate, and inexpensive sequencing technology is nowhere more apparent
than in a demanding sequencing project such as this. To complete the sequencing of a human genome will require the
determination of approximately 3 x 10^9, or 3 billion, base pairs.

The procedures typically used today for sequencing include the methods described in Sanger, et al., Proc. Natl.
Acad. Sci. USA 74:5463-5467 (1977), and Maxam, et al., Methods in Enzymology 65:499-559 (1980). The Sanger
method utilizes enzymatic elongation with chain terminating dideoxy nucleotides. The Maxam and Gilbert method uses
chemical reactions exhibiting specificity of reactants to generate nucleotide specific cleavages. Both methods, however,
require a practitioner to perform a large number of complex, manual manipulations. For example, such methods usually
require the isolation of homogeneous DNA fragments, elaborate and tedious preparation of samples, preparation of a
separating gel, application of samples to the gel, electrophoresing the samples on the gel, working up the finished gel,
and analysis of the results of the procedure.

Alternative techniques have been proposed for sequencing a nucleic acid. PCT patent Publication No. 92/10588,
incorporated herein by reference for all purposes, describes one improved technique in which the sequence of a labeled
target nucleic acid is determined by hybridization to an array of nucleic acid probes on a substrate. Each probe is located
at a positionally distinguishable location on the substrate. When the labeled target is exposed to the substrate, it binds
at locations that contain complementary nucleotide sequences. Through knowledge of the sequence of the probes at
the binding locations, one can determine the nucleotide sequence of the target nucleic acid. The technique is particularly
efficient when very large arrays of nucleic acid probes are utilized. Such arrays can be formed according to the techniques
described in U.S. Patent No. 5,143,854 issued to Pirrung, et al. See also U.S. application Serial No. 07/805,727, both
of which are incorporated herein by reference for all purposes.

When the nucleic acid probes are of a length shorter than the target, one can employ a reconstruction technique
to determine the sequence of the larger target based on affinity data from the shorter probes. See U.S. Patent No.
5,202,231 issued to Drmanac, et al., and PCT patent Publication No. 89/10977 issued to Southern. One technique for
overcoming this difficulty has been termed sequencing by hybridization or SBH. Assume, for example, that a 12-mer
target DNA, i.e., 5'-AGCCTAGCTGAA, is mixed with an array of all octanucleotide probes. If the target binds only to
those probes having an exactly complementary nucleotide sequence, only five of the 65,536 octamer probes (i.e., 3'-
TCGGATCG, CGGATCGA, GGATCGAC, GATCGACT, and ATCGACTT) will hybridize to the target. Alignment of the
overlapping sequences from the hybridizing probes reconstructs the complement of the original 12-mer target.
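
The 12-mer example above can be sketched in a few lines of Python. This is an illustrative sketch only and is not part of the original disclosure: the function names (`complement`, `hybridizing_probes`, `reconstruct`) are hypothetical, and the sketch assumes error-free hybridization and that every overlap of seven bases between octamers is unique.

```python
# Illustrative sketch of sequencing by hybridization (SBH); all names are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement(seq):
    """Base-by-base complement without reversal (a 5'->3' strand maps to its 3'->5' partner)."""
    return "".join(COMPLEMENT[base] for base in seq)


def hybridizing_probes(target, k=8):
    """Return the k-mer probes (written 3'->5') exactly complementary to a window of the target."""
    comp = complement(target)
    return [comp[i:i + k] for i in range(len(comp) - k + 1)]


def reconstruct(probes):
    """Chain probes by (k-1)-base overlaps; assumes each overlap is unique."""
    by_prefix = {p[:-1]: p for p in probes}
    suffixes = {p[1:] for p in probes}
    # The first probe is the one whose prefix is not the suffix of any probe.
    current = next(p for p in probes if p[:-1] not in suffixes)
    assembled = current
    for _ in range(len(probes) - 1):
        current = by_prefix[current[1:]]
        assembled += current[-1]
    return assembled


target = "AGCCTAGCTGAA"
probes = hybridizing_probes(target)
print(len(probes), "of", 4 ** 8, "octamers hybridize:", probes)
print("reconstructed complement:", reconstruct(sorted(probes)))
```

The probes are deliberately passed to `reconstruct` in sorted order to show that the alignment does not depend on knowing their original order along the target.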
target-oligonucleotide hybrid complexes are treated with a nuclease and [...]

complexes are washed to remove non-perfectly [...] oligonucleotide

[...] an array of

Each of the oligonucleotides in the [...] the methods described herein.
target nucleic acid, the target nucleic [...]

necessarily labelled. After the array [...] nucleic acid is not

oligonucleotide hybrid complexes [...] target nucleic acid to form target-
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 3 illustrates the light directed synthesis of an array of oligonucleotides on a substrate.

corresponding probes from second, third and [...] probe set and

[...]

FIG. [...] illustrates a tiling strategy for analyzing closely spaced mutations.
FIG. [...] illustrates a tiling strategy for avoiding loss of signal due to probe self-annealing.

length is 15 and the [...]

FIGS. 17A to 17C illustrate methods which can be used to prepare single-stranded nucleic acid sequences.
DETAILED DESCRIPTION OF THE INVENTION AND PREFERRED EMBODIMENTS
The following terms are intended to have the following general meanings as they are used herein:

1. Substrate: A material having a rigid or semi-rigid surface. In many embodiments, at least one surface of the
substrate will be substantially flat, although in some embodiments it may be desirable to physically separate
synthesis regions for different polymers with, for example, wells, raised regions, etched trenches, or the like. In some
embodiments, the substrate itself contains wells, trenches, flow through regions, etc. which form all or part of the
synthesis regions. According to other embodiments, small beads may be provided on the surface, and compounds
synthesized thereon may be released upon completion of the synthesis.

2. Predefined Region: A predefined region is a localized area on a substrate which is, was, or is intended to be used
for formation of a selected polymer and is otherwise referred to herein in the alternative as a "reaction" region, a
"selected" region, or simply a "region." The predefined region may have any convenient shape, e.g., circular,
rectangular, elliptical, wedge-shaped, etc. In some embodiments, a predefined region and, therefore, the area upon
which each distinct polymer sequence is synthesized is smaller than about 1 cm^2, more preferably less than 1 mm^2,
and still more preferably less than 0.5 mm^2. In most preferred embodiments, the regions have an area less than
about 10,000 µm^2 or, more preferably, less than 100 µm^2. Within these regions, the polymer synthesized therein is
preferably synthesized in a substantially pure form.

3. Substantially Pure: A polymer or other compound is considered to be "substantially pure" when it exhibits
characteristics that distinguish it from the polymers or compounds in other regions. For example, purity can be measured
in terms of the activity or concentration of the compound of interest. Preferably the compound in a region is sufficiently
pure such that it is the predominant species in the region. According to certain aspects of the invention, the compound
is 5% pure, more preferably more than 10% pure, and most preferably more than 20% pure. According to more
preferred aspects of the invention, the compound is greater than 80% pure, preferably more than 90% pure and
more preferably more than 95% pure, where purity for this purpose refers to the ratio of the number of compound
molecules formed in a region having a desired structure to the total number of non-solvent molecules in the region.

4. Monomer: In general, a monomer is any member of the set of molecules which can be joined together to form
an oligomer or polymer. The set of monomers useful in the present invention includes, but is not restricted to, for
the example of oligonucleotide synthesis, the set of nucleotides consisting of adenine, thymine, cytosine, guanine
and uridine (A, T, C, G, and U, respectively) and synthetic analogs thereof. As used herein, monomers refers to any
member of a basis set for synthesis of an oligomer. Different basis sets of monomers may be used at successive
steps in the synthesis of a polymer.

5. Oligomer or Polymer: The oligomer or polymer sequences of the present invention are formed from the chemical
or enzymatic addition of monomer subunits. Such oligomers include, for example, linear, cyclic and branched
polymers of nucleic acids, polysaccharides, phospholipids, and peptides having either α-, β-, or ω-amino acids,
heteropolymers in which a known drug is covalently bound to any of the above, polyurethanes, polyesters,
polycarbonates, polyureas, polyamides, polyethyleneimines, polyarylene sulfides, polysiloxanes, polyimides, polyacetates,
or other polymers which will be readily apparent to one skilled in the art upon review of this disclosure. As used
herein, the term oligomer or polymer is meant to include such molecules as β-turn mimetics, prostaglandins and
benzodiazepines which can also be synthesized in a stepwise fashion on a solid support.

6. Peptide: A peptide is an oligomer in which the monomers are amino acids and which are joined together through
amide bonds and alternatively referred to as a polypeptide. In the context of this specification it should be appreciated
that when α-amino acids are used, they may be the L-optical isomer or the D-optical isomer. Other amino acids
which are useful in the present invention include unnatural amino acids such as β-alanine, phenylglycine,
homoarginine and the like. Peptides are more than two amino acid monomers long, and often more than 20 amino
acid monomers long. Standard abbreviations for amino acids are used (e.g., P for proline). These abbreviations are
included in Stryer, Biochemistry, Third Ed., (1988), which is incorporated herein by reference for all purposes.
7. Oligonucleotides: An oligonucleotide is a single-stranded DNA or RNA [...]

[...] phosphoramidite method described by Beaucage and Carruthers, Tetrahedron Lett. 22:1859-1862 [...],
or by the triester method according to Matteucci, et al., J. Am. Chem. Soc. 103:3185 [...]

[...] technologies (discussed in detail below). When oligonucleotides are referred to as [...]
12. Epitope: The portion of an antigen molecule which is delineated by the area of interaction with the subclass of
receptors known as antibodies.

13. Identifier tag: A means whereby one can identify which molecules have experienced a particular reaction in the 
synthesis of an oligomer. The identifier tag also records the step in the synthesis series in which the molecules 
experienced that particular monomer reaction. The identifier tag may be any recognizable feature which is, for exam- 
ple: microscopically distinguishable in shape, size, color, optical density, etc.; differently absorbing or emitting of 
light; chemically reactive; magnetically or electronically encoded; or in some other way distinctively marked with the 
required information. A preferred example of such an identifier tag is an oligonucleotide sequence. 

14. Ligand/Probe: A ligand is a molecule that is recognized by a particular receptor. The agent bound by or reacting
with a receptor is called a "ligand," a term which is definitionally meaningful only in terms of its counterpart receptor.
The term "ligand" does not imply any particular molecular size or other structural or compositional feature other than
that the substance in question is capable of binding or otherwise interacting with the receptor. Also, a ligand may
serve either as the natural ligand to which the receptor binds, or as a functional analogue that may act as an agonist
or antagonist. Examples of ligands that can be investigated by this invention include, but are not restricted to, agonists
and antagonists for cell membrane receptors, toxins and venoms, viral epitopes, hormones (e.g., opiates, steroids,
etc.), hormone receptors, peptides, enzymes, enzyme substrates, substrate analogs, transition state analogs,
cofactors, drugs, proteins, and antibodies. The term "probe" refers to those molecules which are expected to act like
ligands but for which binding information is typically unknown. For example, if a receptor is known to bind a ligand
which is a peptide β-turn, a "probe" or library of probes will be those molecules designed to mimic the peptide
β-turn. In instances where the particular ligand associated with a given receptor is unknown, the term probe refers to
those molecules designed as potential ligands for the receptor.

15. Receptor: A molecule that has an affinity for a given ligand or probe. Receptors may be naturally-occurring or
manmade molecules. Also, they can be employed in their unaltered natural or isolated state or as aggregates with
other species. Receptors may be attached, covalently or noncovalently, to a binding member, either directly or via
a specific binding substance. Examples of receptors which can be employed by this invention include, but are not
restricted to, antibodies, cell membrane receptors, monoclonal antibodies and antisera reactive with specific
antigenic determinants (such as on viruses, cells or other materials), drugs, polynucleotides, nucleic acids, peptides,
cofactors, lectins, sugars, polysaccharides, cells, cellular membranes, and organelles. Receptors are sometimes
referred to in the art as anti-ligands. As the term receptors is used herein, no difference in meaning is intended. A
"ligand-receptor pair" is formed when two molecules have combined through molecular recognition to form a
complex. Other examples of receptors which can be investigated by this invention include but are not restricted to:

a) Microorganism receptors: Determination of ligands or probes that bind to receptors, such as specific transport
proteins or enzymes essential to survival of microorganisms, is useful in developing a new class of antibiotics. Of
particular value would be antibiotics against opportunistic fungi, protozoa, and those bacteria resistant to the
antibiotics in current use.

b) Enzymes: For instance, the binding site of enzymes such as the enzymes responsible for cleaving neuro- 
transmitters. Determination of ligands or probes that bind to certain receptors, and thus modulate the action of 
the enzymes that cleave the different neurotransmitters, is useful in the development of drugs that can be used 
in the treatment of disorders of neurotransmission. 

c) Antibodies : For instance, the invention may be useful in investigating the ligand-binding site on the antibody 
molecule which combines with the epitope of an antigen of interest. Determining a sequence that mimics an 
antigenic epitope may lead to the development of vaccines of which the immunogen is based on one or more 
of such sequences, or lead to the development of related diagnostic agents or compounds useful in therapeutic 
treatments such as for autoimmune diseases (e.g., by blocking the binding of the "self" antibodies). 

d) Nucleic Acids: The invention may be useful in investigating sequences of nucleic acids acting as binding sites
for cellular proteins ("trans-acting factors"). Such sequences may include, e.g., transcription factors,
suppressors, enhancers or promoter sequences.

e) Catalytic Polypeptides : Polymers, preferably polypeptides, which are capable of promoting a chemical reac- 
tion involving the conversion of one or more reactants to one or more products. Such polypeptides generally 
include a binding site specific for at least one reactant or reaction intermediate and an active functionality prox- 
imate to the binding site, which functionality is capable of chemically modifying the bound reactant. Catalytic 
polypeptides are described in Lerner, R.A. et al., Science 252:659 (1991), which is incorporated herein by
reference. 

f) Hormone receptors : For instance, the receptors for insulin and growth hormone. Determination of the ligands 
which bind with high affinity to a receptor is useful in the development of, for example, an oral replacement of 
g) Opiate receptors: Determination of ligands that bind to the opiate receptors in the brain is useful in the
development of less-addictive replacements for morphine and related drugs.
various enzymes that catalyze oligonucleotide cleavage and ligation reactions [...]

target-oligonucleotide hybrid complexes, the target-oligonucleotide hybrid [...]

bulges, loops, and even single base mismatches can be recognized and cleaved by RNase A [...]

Moreover, ligation reactions can be used to discriminate between fully complementary hybrids and those that differ
by one or more base pairs. T4 DNA ligase, for example, can be used to identify [...]

[...] immobilized oligonucleotide probes. The ligation [...]

nucleotides to the 5' end of oligonucleotide probes on a substrate will occur, in the presence of a ligase, only [...]
A. Generating An Array Using Light-Directed Methods

[...] light directed methods (which are one technique in a family of methods [...]

[...] is also incorporated herein by reference for all purposes. Still further techniques include bead based [...]
[...] preferred for generating an array of oligonucleotides on a single substrate. The surface

[...] is rinsed and the surface illuminated through a [...] to
B. Generating An Array Of Oligonucleotides Using Flow Channel Or Spotting Methods

[...] flow channels [...]
reagents are placed [...] appropriate reagents flow or in which appropriate

monomer B at the second selected locations. In this [...] channel(s), binding

at this stage of processing [...]

of desired length at known locations [...] is repeated [...]
C. Generating An Array Of Oligonucleotides Using Pin-Based Methods
D. Generating An Array Of Oligonucleotides Using Bead Based Methods

Ser. Nos. 07/762,522 (filed [...]) [...] copending [...]

[...] 07/876,792 (filed [...]) [...]

[...] the disclosures of which are incorporated herein by reference.

IV. Sequencing By Hybridization Using the Probe Tiling Strategy
methods [...] large scale practice with a high degree of confidence in the
A. Selection of Reference Sequences

[...] designed to contain probes exhibiting complementarity to one or more selected reference sequence

[...] to identify pathogenic and/or [...] the

[...] acquire drug resistance (e.g., the HIV reverse transcriptase gene). Other reference sequences of interest
include regions where polymorphic variations are known to occur (e.g., the D-loop [...]

[...] CMV, Epstein Barr virus, adenovirus, influenza virus, flaviviruses, echovirus,

[...] mumps

vaccinia virus, HTLV virus, dengue virus, papillomavirus, molluscum virus, poliovirus, rabies virus, JC virus

The length of a reference sequence can vary widely from a full-length genome, to an individual chromosome,
episome, gene, component of a gene, such as an exon, intron or regulatory sequences, to a few nucleotides. [...]
sequence of between about 2, 5, 10, 20, 50, 100, 500, 1000, 5,000 or 10,000, 20,000 or 100,000 nucleotides is [...]

[...] RNA or DNA. For example, sequences can be obtained from computer data bases, publications, or can be deter-

B. Array Design
1. Basic Tiling Strategy

The basic tiling strategy provides an array of immobilized probes for analysis of target sequences showing a high

degree of sequence identity to one or more selected reference sequences. [...]

[...] although it will be apparent that in some situations, satisfactory results are obtained

[...] probe sets. A probe set comprises a plurality of probes exhibiting perfect complementarity with [...]
[...] sequence. The perfect complementarity usually exists throughout the length of the probe. [...]

probes having a segment or segments of perfect complementarity that is/are flanked by leading or trailing sequences

[...]

[...] interrogation position that corresponds to a nucleotide in the reference sequence

[...] interrogation position is aligned with the corresponding nucleotide in the reference sequence [...]

[...] position, that corresponds with a respective nucleotide in the reference sequence. [...]
[...] corresponding nucleotide in a particular probe in the first probe set cannot be determined

[...] probe set and corresponding probes from

In principle, a probe could have an interrogation position at each position in the segment complementary to the
reference sequence. Sometimes, interrogation positions provide more accurate data when located away from the ends
[...]

in the reference sequence [...] the nucleotide of interest
[...] of interest. Usually, the probes from [...]

the first probe set with one exception [...] the probes from

occurs in the same position in each of the [...] (and only one) interrogation position, which
nucleotide in the four [...]

position than to the reference sequence [...] sequence mutated at the interrogation
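
The arrangement of a first probe set and three additional probe sets differing only at the interrogation position can be sketched as follows. This is a hedged illustration rather than the disclosed design procedure: the function name `basic_tiling`, the example reference sequence, the probe length of 15 and the central interrogation index of 7 are all assumptions chosen for the example.

```python
# Illustrative sketch of basic tiling; names, lengths and the example sequence are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
BASES = "ACGT"


def complement(seq):
    """Base-by-base complement without reversal."""
    return "".join(COMPLEMENT[base] for base in seq)


def basic_tiling(reference, probe_len=15, interrogation_index=7):
    """Return one column per tiled reference position.

    Each column is (reference_position, lanes), where lanes maps the base placed at the
    interrogation position to the full probe. One lane is perfectly complementary to the
    reference; the other three differ from it only at the interrogation position.
    """
    columns = []
    for start in range(len(reference) - probe_len + 1):
        perfect = complement(reference[start:start + probe_len])
        lanes = {
            base: perfect[:interrogation_index] + base + perfect[interrogation_index + 1:]
            for base in BASES
        }
        columns.append((start + interrogation_index, lanes))
    return columns


reference = "ACGTTGCAAGCTTAGGCATC"  # hypothetical 20-base reference
for position, lanes in basic_tiling(reference)[:2]:
    print(position, reference[position], lanes)
```

With this layout, the lane keyed by the complement of the reference base at a given position holds the perfectly matched probe for that column.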
lower limit of 25, 50 or 100 probes and an upper limit of 1,000,000, 100,000, 10,000 or 1000 probes. [...]

have other components besides the probes such as [...]

[...] segment(s) in corresponding probes from additional probe sets should be [...]

[...] all probes are the same length. Other arrays employ different groups of probe sets, in which case
the probes are of the same size within a group, but differ between different groups. For example, some [...]
group comprising four sets of probes as described above in which all the [...]

be added. Thus, some arrays contain, e.g., four groups of probes having sizes of 11 mers, 13 mers, 15 mers and 17
mers. Other arrays have different size probes within the same group of four probe sets. In these arrays, [...]
the [...] set can vary in length independently of each other. Probes in the other sets are usually the same length as [...]
probe occupying the same column from the first set. However, occasionally different lengths of probes can be included

interrogation position providing the greatest differential hybridization signal [...]

Some arrays contain additional probes or groups of probes designed to be complementary to a second reference

[...] commonly occurring mutations or interstrain variations. The second group of probes is designed [...]
as described above except that the probes exhibit [...]

mutations are expected to occur within a short distance commensurate with the length of the probes (e.g., [...]
[...] bases). Of course, the same principle can be extended to provide [...]

[...] serves as probe(s) for a conventional reverse dot blot. For example, the
[...] sequence to a single oligomer [...]

[...] Preferably, an additional probe containing the equivalent region of the wildtype sequence is included as a

Although only a subset of probes is required to analyze a particular target sequence, it is quite possible that other
probes superfluous to the [...] analysis are also included on the array [...]
sequence. Such an array also allows simultaneous analysis [...] probes for analyzing any reference

[...] subsets having the interrogation [...]

corresponding probes from other probe sets [...] show a stronger hybridization signal than
stronger hybridization [...] probes from the first probe set still show a

of interest in the target sequence [...] probe [...] for nucleotide

one probe from each of A, C, G and T lanes. The nucleotide [...] For example, a column often contains
the nucleotide occupying the interrogation position [...] target sequence is identified as the complement of
FIG.ISshowstheKr*^ 

represems the probe^rom thSn ha^^^^^^^^ 
- ^nofd^^^ 

nucleotide occupying the interrogation position of the probe represented by [...] square is an A. The first nucleotide in
the reference sequence is the complement [...] interrogation position of this probe (i.e., a T).
Similarly, [...]

sequence is an A. [...]

toread,.^ 

usua7y elSKV^ -* ? tne *»« sequence whereas the others 

greater hybntiization signal tha SESE^^!^ T*!* m3tCh P roduces a 
regions of the targe, s^uence I dSn ^ I"""*- However ' in s °™ 

a call ratio is established to define the ratio of [...] mismatch is less clear. Thus,

probe that must be exceeded for a [...] second best [...]
few if any errors are made [...] call [...] ensures that

which could in fact be accurately [...]
calls. It has been found
of bases (e.g., up to about [...]%) [...]
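
A call-ratio rule of the kind discussed above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names `call_base` and `call_sequence`, the use of "N" for an uncalled base, and the default threshold of 1.2 are hypothetical and are not values taken from the specification.

```python
# Illustrative base caller using a call ratio; names and the 1.2 threshold are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def call_base(lane_signals, call_ratio=1.2):
    """Call the target base for one column of four probes.

    lane_signals maps the base at each probe's interrogation position to its
    hybridization signal. The target base is the complement of the interrogation
    base of the strongest probe, provided the strongest signal exceeds the second
    strongest by more than call_ratio; otherwise no call ("N") is made.
    """
    ranked = sorted(lane_signals.items(), key=lambda item: item[1], reverse=True)
    (best_base, best_signal), (_, second_signal) = ranked[0], ranked[1]
    if best_signal <= call_ratio * second_signal:
        return "N"
    return COMPLEMENT[best_base]


def call_sequence(columns, call_ratio=1.2):
    """Call one base per column, in column order."""
    return "".join(call_base(column, call_ratio) for column in columns)


columns = [
    {"A": 10, "C": 100, "G": 12, "T": 9},   # clear winner in the C lane
    {"A": 100, "C": 95, "G": 5, "T": 4},    # best and second best are too close
]
print(call_sequence(columns))
```

Raising the threshold trades fewer miscalls for more positions left uncalled, which is the balance the call ratio is meant to control.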

from which it was designed one praS JSS 7^117% " ? * the 6X301 reference s W 

must be based on different degrees of mismatch between th? Z^S*^ ? t0,hetar S e tand that any comparison 

the target nucleotide correspoLg to t^ comparison *> es « a ^s allow 

by loss of signal from probes haJg interrogaZ^^^^ can be detected 
.ostf ro m P robesha«^ 
determined independently. Moreover, [...] regions of a target sequence are

group of probes can be [...] diverse variants [...] a single

can be designed based on a full-length [...] one group of probes

sequence incorporating frequently occurring [...]

[...] conventional methods is the capacity

[...] of patients who are [...] valuable [...] for diag-

usually present in several polymorphic forms [...] infected with a virus, such as HIV, [...]
cells and surrounding tissues [...] analyzing targets from [...] tumor

four probes at the array [...] of the

of the four probes for the mixture under test [...] at which diversity occurs. The relative signals
sequence. An increase [...]

corresponding decrease [...]
of a mutant strain in the mixture. The extent in shift [...] sequence, signal the presence
a target sequence in the mixture [...] to the proportion of

reference and mutant sequence by prior calibration [...]

reference sequences even when none is identical to the

would be a [...] sequences bearing first and second mutations [...]
second mutations relative to [...] positions corresponding to the first and

having a mismatched interrogation [...] sequence. At each position, one of the probes

signal, and the probe having [...]
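
One simple way to relate a shift in lane signals to the proportion of a mutant sequence in a mixture is sketched below. This is an assumption-laden illustration, not the disclosed method: it assumes that the four lane signals of a mixture are a linear blend of calibration signals measured beforehand for pure reference and pure mutant targets, and the function name `mutant_fraction` and all signal values are hypothetical.

```python
# Illustrative estimate of a mutant fraction from four lane signals.
# Assumes a linear mixing model and prior calibration; all names and values are hypothetical.

def mutant_fraction(observed, pure_reference, pure_mutant):
    """Least-squares estimate of f in: observed = (1 - f) * pure_reference + f * pure_mutant.

    Each argument maps the interrogation base of a lane ("A", "C", "G", "T") to a signal.
    The result is clamped to the range [0, 1].
    """
    numerator = 0.0
    denominator = 0.0
    for base in "ACGT":
        shift = pure_mutant[base] - pure_reference[base]
        numerator += (observed[base] - pure_reference[base]) * shift
        denominator += shift * shift
    if denominator == 0:
        raise ValueError("calibration signals for reference and mutant are identical")
    return min(1.0, max(0.0, numerator / denominator))


pure_reference = {"A": 100.0, "C": 8.0, "G": 10.0, "T": 6.0}
pure_mutant = {"A": 12.0, "C": 90.0, "G": 9.0, "T": 7.0}
mixture = {b: 0.75 * pure_reference[b] + 0.25 * pure_mutant[b] for b in "ACGT"}
print(round(mutant_fraction(mixture, pure_reference, pure_mutant), 3))
```

Real hybridization signals need not be linear in concentration, so a measured calibration curve would replace the linear model in practice.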

2. Block Tiling 
to a probe from the first probe set [...] strategy is [...]

equivalent to probes from th sSon I thS l^^,fl T 4 66 m,smatehed P r *» in set A in block tiling are 
probes in set B of blo^ ^ !" *» «** ^ three mismatched 

fifth, sixth and seventh probe IL Zl^^Z^^ *?* t T * basic ti,in 9 »i.y designated the 

interr^^rra^ 

secondadvart^eisthatichdS^^ l, *T' on « te *■*«» fr ™ probes a 

probe from eachrttheotheprXs^ 
segment containing the interrcUonS 

eg/ for solid phase synthes^ 
onthearray, thereby * »urn increasi^^ 

V. Enzymatic Discrimination Enhancement 

[...] by using various enzymes [...] oligonucleotide cleavage and ligation reactions

A. Enhanced Discrimination Using Nuclease Treatment

torn, ^UlJCiSSSiS?' ""I 8 lab9 " M , "<* nuc,8lc aca <° 
sequence of ffie target nucleic add STOeotoe arrays and, in turn, to more accurately determine the 
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to be <dentif,ed are m dose proximity to one another (i.e.. so that they are in predefined regions which are near tc 

shouW be no ed that corrfocal detection allows hybridization to be measured in the presence of excess labeStarae 
and. hence, if desired, hybridization can be detected in real time. 9 

^Sh^^f ati ° n, £ e SUbSfra,e haVing 30 ^ * ,ar 9 e,: "'^nucleotide hybridization complexes thereon is 

SIS? T ? , ea ! a ^ iS m08t Simp ' y carried 0ut by addin 9 a solufon ° f «he nuclease to thence dil 
substrate Altemaftvely. however, this can be earned out by ftowing a solution of the nuclease 

us.ng the buffer used to carry out the hybridization reaction (i.e.. the hybridization buffer) The conceSn TtS 

™E« r er ' T 6 * m ! WhiCh th6 array * ^^oligonucleotide hybridization complexes is in cStect wS The 

SurPniH^T 1 £ C ? n . UC,eaSe treatn1en, iS ^ 0Ut for a of time rang^ from iou^ minu^s to I 
hours. Following treatment with the nuclease, the substrate is again washed with the hybridization buffer and

m^^ 

As such, nuclease treatment can be used following hybridization to improve the quality of hybridization signals on
high density oligonucleotide arrays and, in turn, to more accurately determine the sequence of the target

In another embodiment, the present invention provides a method for obtaining sequencing information about an unlabeled
target oligonucleotide, comprising: (a) contacting an unlabeled target oligonucleotide with a library of labeled oligonucleotide
probes, each of the oligonucleotide probes having a known sequence and being attached to a solid support at
a known position, to form a hybridized library; (b) contacting the hybridized library with a nuclease capable of cleaving
double-stranded oligonucleotides to release from the library a portion of the labeled oligonucleotide probes or fragments
thereof; and (c) identifying the positions of the hybridized library from which labeled probes or fragments thereof have been
removed, to determine the sequence of the unlabeled target oligonucleotide.
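
The final identification step amounts to comparing the label signal at each known probe position before and after nuclease treatment. The following is a minimal Python sketch of that comparison, not a procedure taken from the specification. It assumes, hypothetically, that the two scans are available as dictionaries mapping a probe position to an intensity, and that a position counts as cleaved when it retains less than a chosen fraction of its starting signal. The function name and the 0.5 cutoff are illustrative only.

```python
def cleaved_positions(before, after, retained_fraction=0.5):
    """Return the positions whose label signal dropped after nuclease treatment.

    before, after: dicts mapping position -> intensity (hypothetical format).
    retained_fraction: illustrative cutoff; a position is reported when the
    ratio after/before falls below it.
    """
    hits = []
    for pos, start in before.items():
        if start <= 0:
            continue  # no detectable label here before treatment
        if after.get(pos, 0.0) / start < retained_fraction:
            hits.append(pos)
    return sorted(hits)


# Illustrative intensities for three array positions (row, column).
before = {(0, 0): 100.0, (0, 1): 90.0, (1, 0): 110.0}
after = {(0, 0): 95.0, (0, 1): 12.0, (1, 0): 104.0}
print(cleaved_positions(before, after))  # [(0, 1)]
```

Because every probe sequence is known from its position, the reported positions translate directly into the set of probe sequences that hybridized to the target.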

In a specific embodiment, a library of oligonucleotide probes is prepared, for example, using the VLSIPS™
technology described above (See, Section III, supra). Once the library of probes has been prepared, the 5' terminus of

SscS S 3 6 label such as those described in Section V ' **» ****** th < ^ a 

The library of labeled oligonucleotide probes is then contacted with an unlabeled target oligonucleotide. The unlabeled
oligonucleotide can be synthetic or can be isolated from natural sources. In preferred embodiments, the unlabeled
oligonucleotide is genomic DNA or RNA. For example, purified DNA or a whole-cell digest which has been partially
sequenced can be lightly fragmented (e.g., by digestion with a restriction enzyme which provides infrequent cuts and

using a column containing probes complementary to a part of the sequence of interest. The complementary fragments
by heat or by chemical means) and contacted with the library of probes 

Once the library of probes has been contacted with the target oligonucleotide under conditions sufficient for hybridization
to occur, the resulting hybridized library is contacted with an appropriate nuclease enzyme. Alternatively, the
nuclease can be introduced to the library in the same mixture as the target oligonucleotide. The nuclease can be

^ei'S 

The hybridized library which has been contacted with the nuclease is then washed to remove the label from those
positions wherein hybridization has taken place. By scanning the washed library with a detector to determine the presence
or absence of labels in a

9 T TJ^™ ' V ' ^ mUtati ° n deteC,ion and 0,her ^binatorial methods. Other^va^g s So Te 
present method, including (i) the use of unlabeled target oligonucleotide, which simplifies target preparation and allows

Znr^T* *■? (H) *• USe 01 3 ™* ° f nuCl6aSes which <»" be seleL'forSng the tS 
and probe, the probe alone, or probe-probe interaction, and (iii) application using existing VLSIPS technology.

The foregoing enzymatic discrimination enhancement methods can be used in all instances where improved discrimination
between fully complementary hybrids and those that differ by one or more base pairs would be helpful. More
particularly, such methods can be used to more accurately determine the sequence (e.g., de novo sequencing), or
monitor mutations, or resequence the target nucleic acid (i.e., such methods can be used in conjunction with a second
sequencing procedure to provide independent verification).
to discriminate between fully complementary hybrids and those that differ by one or
more base pairs. More particularly, an array of oligonucleotides is generated on a substrate (in the 3' to 5'

f~^~ 

E ! 77 ? ° f «^™««>^ hybridization complexes hToSS with the S iua t 

SaX ^? m T^ 8 ^ 3 ^ * ,ime rangi " 9 fr0m minutes to n-dreds of* ho S * 

In another embodiment, the present invention provides another method which can be used to improve discrimination
of base-pair mismatches near the 5' end of the immobilized probes. More particularly, the present invention provides a
method for sequencing an unlabe.ed target oligonucleotide, the method comprising: w3S ^ZSSSS, 
pnsing an array of positionally distinguishable oligonucleotide probes each of ^vhich "h«a^^?^ ?T 
region, the variable region capable of binding to a defined subsequence 2£^£T£E^^ 
oSSES" a r UenCe ^ " to *. constant ^5SSS3^^?S a 

preselected length, (c) removing unbound target nudeic acid and ladled, 52?^^ 

determm.ng wh,ch of the oligonucleotide probes contain the labelled, ligatable oiJonucKS^^SS 

X22Z2^V£^ n r° tides in len9th - The ,abel,ed ' li9atab,e o*^2£^^ 

preselected length, and the pool of such probes represents all possible sequences of the preselected length. Thus, if
the probe is 6 nucleotides in length, all possible 6-mers are present in the pool.
2 t* - ?T * formati ° n ° f 9 phos P hodiester at th'elite ofTsi hSSSSSK 

'iSSE; ^ ' ^ ^ Hmited * T4 DNA li93Se ' "' 9ases isolated from £ «• a^ ligases isolaJS from 

Ip 225 f h C °^ 0nS ' bUt Wi " typiCa " y ran9e from 500 units/ml •> about 5^)00 unS^MoZer Se 
larv xJf, I 1 T y 0 ' taf9et 0,i 9° nucleo «^:oligonucleotide probe hybrid complexes is in contact wi theZse 2 

ESJEfF Ik 9aS6 H- reatmem iS C ' ni8d ° Ut for 3 P6ri0d 0< time ran9in 9 from from ™*<* t MdZ! 
In addition, it will be readily apparent to those of skill that the two ligation reactions can either be

a 4 12«^!^ a ^ 0d ■ ^J 81 " B - at,0n reacllon » the 5- end of the target dloonuclaotide (/ e thelast 

ttEffr h T r,able r89IOn * ^ oli 9° nucleo «de probe. Similarly, the second ligation reaction U ch a*ls 
a label to the probe. w,ll occur efficiently only if the first ligation reaction was successful and if SSfaaJta^fS 
complementary to the 5' end of the probe. Thus, this method provides for e^^atboSJSoli^^So? 
Moreover,*^ 

increases probe:target specificity and removes the necessity of labeling the target.
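
The pool of labelled, ligatable probes described above contains every sequence of the preselected length, so its size grows as 4 to the power of that length. A short Python sketch of the enumeration follows; the function name is illustrative, and the four-letter DNA alphabet is an assumption made for the example.

```python
from itertools import product


def all_nmers(n, alphabet="ACGT"):
    """Yield every sequence of length n over the given alphabet."""
    for letters in product(alphabet, repeat=n):
        yield "".join(letters)


pool = list(all_nmers(6))
print(len(pool))          # 4096, i.e. 4**6
print(pool[0], pool[-1])  # AAAAAA TTTTTT
```

For a preselected length of 6 the pool therefore holds 4,096 distinct probes, which is why the complete set can be supplied as a single mixture.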
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endJth^ - "-P-r mismatches near the * 

enhancememdiscr^^^ 

discrimination between fully comptomentaj^^ 

More particular., such meLs Sete^ P f WOuW be ^ 

or monitor mutations, or reseauence th P taroof „, =J£v aeiermine tne sequence (e.g., de novo sequencing), 
sequencing prc^ue to 

foregoing is intended to illuSe J V * ^ * sW " in the art tha1 the 

can be treated with a ligaseTnd mX^ISSXSI * tar9et:oli 9 onuc,eot ^ »*" complexes 

ligatable probes to improve hybridization signals on high density oligonucleotide arrays.



VI. Detection Methods



belo^^ 

cedures are used to determine the Xn^ Standard P r °" 

For example, if a target sequence is M^mJSSS^ nS £ J^T"","? * W tate 
locations where the oligonucleotides intend wHh ., !l * ol.gonucleotde probes, only those 
Desirably, fluorescers should absorb light above about 300 nm, preferably above about 350 nm, and more preferably above
about 400 nm, usually emitting at wavelengths greater than about 10 nm higher than the wavelength of the light absorbed.
It should be noted that the absorption and emission characteristics of the bound dye can differ from the unbound dye.
Therefore, when referring to the

-'e^ «~ Chemiluminescent 

serves as thedetectibLgnal or don^ 
have been found to providLhemLmSr'S 

family indude the 5-aminc^7 «^ Otar members oYthe 

to luntnesce v«th alkaline hydrogen perSdeS i^TES^^ ^ "W* 030 be mad * 
2.4.5-triphenylimidazoles. with loTn ^ Z^TJ^T "* b88a Anrther family * c <™P<™ds is the 
pMmrtqLno and ^tSTu^ 
active esters eo., p-rttropheSSeS f 0 ^1^ ^" *° be 0Wained "" h oxalates - usuallv °* a 'W 

is directly converted to a database indicating what ^n^- J ^ * ? interaction. Thus, the positional information 
hybridan application^ 

can be directly listed from the positional infemJ^ rZ 1 9 matrix and ,he ,ar 9 e « mo| ecule 

WO90/15070; and US.S™^^ * h PCT P ublica «°" •» 

can be replaced by a veOnM^^^^^^'^^ 1 ^ is a fluorescence detector, thedetector 
to a fixed substrate" .ESSE wrtfa mov SJJbS^SSS T" ^ ° f 3 ^ detector relalive 
can be used to transfer the signal directly toT 0r f er a « 

U.S.S.N. 07/624,120, which is hereby incorporated herein by reference.

pa^S^ 

actual positive signal may tend to^^ & 7^nZT'Zr^ Fo , r . exam P |e ' a si 9 nal fr ™ * region which has 
not have one. Tfi may m»J^S!Z ZnZaZZT^ "! "l"^ re 9 ion whi <* actually should 
olution in its pixel density to awam^tlJ ST tTI * T^"* d,scriminatin 9 witn sufficiently high res- 
by pixel to determine t XtwK^T f! ?" ^ ^ may be Med 
a uniform signal at each pixel Ston t7J^S£2J£ T„ 3 . * tme P ° Sitive si£,nal should ' in theor >'- * » 

with a computer the information ne*d ShSSSdSS *! !^T 9 "* -8m interfaces 

of data with very little human intervene ?*!^^^ESE^^ * t0 h>n * am ° Unte 
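
The pixel-by-pixel analysis mentioned above relies on the expectation that a true positive region gives a roughly uniform signal at every pixel, whereas signal spilling over from a neighbouring region does not. A minimal Python sketch of one way to test this is given below. It uses the coefficient of variation as the uniformity measure, which is an assumption made for illustration and not a measure named in the text. The 0.2 limit and the function name are likewise hypothetical.

```python
from statistics import mean, pstdev


def is_uniform(pixels, max_cv=0.2):
    """Return True when the pixel intensities of one region vary little.

    pixels: intensities for the pixels inside a single predefined region.
    max_cv: illustrative limit on the coefficient of variation (std / mean).
    """
    m = mean(pixels)
    if m <= 0:
        return False  # no signal to evaluate
    return pstdev(pixels) / m <= max_cv


print(is_uniform([98, 102, 100, 101]))  # True
print(is_uniform([5, 4, 180, 6]))       # False
```

A region failing the test can be flagged for inspection rather than being scored as a positive interaction.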
The data analysis can be performed manually or, preferably, by using an appropriate program. Although

the specrfic man.pufat.ons necessary to reassemble the target sequence from fiagmentemav take ml™ II T. 

Generally, such computer programs provide for automated scanning of the substrate to determine the positions of
oligonucleotide and target interaction. Simple processing of the intensity of the signal may be incorporated to filter out
clearly spurious signals. The positions with positive interaction are correlated with the sequence specificity of specific
matrix positions, to generate the set of matching subsequences. This information is

2„T m1 °T 0nt ^ reStriC,i0n ,ra9ment ana,ySis - The *«V™<* « a gn* u"ng 
lead.ng to poss,ble corresponding target sequences which «, optimaNy, correspond to a 5,?^^ 
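
The processing chain outlined above (filter out spurious signals, map positive positions to their probe sequences, then rebuild the target from the matching subsequences) can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the program used by the inventors: the intensity cutoff is arbitrary, every subsequence of the target is assumed to be detected exactly once, and each overlap between neighbouring subsequences is assumed to be unique. All names are hypothetical.

```python
def positive_probes(signals, cutoff):
    """Keep the probe sequences whose intensity reaches the cutoff."""
    return {seq for seq, intensity in signals.items() if intensity >= cutoff}


def assemble(kmers):
    """Rebuild a sequence from its overlapping k-mers (k >= 2).

    Returns None when the k-mers do not chain into one unambiguous sequence.
    """
    kmers = set(kmers)
    k = len(next(iter(kmers)))
    by_prefix = {}
    for s in kmers:
        by_prefix.setdefault(s[:-1], []).append(s)
    suffixes = {s[1:] for s in kmers}
    # The first k-mer is the only one whose prefix is not another k-mer's suffix.
    starts = [s for s in kmers if s[:-1] not in suffixes]
    if len(starts) != 1:
        return None
    seq = starts[0]
    used = {seq}
    while True:
        candidates = [s for s in by_prefix.get(seq[-(k - 1):], []) if s not in used]
        if not candidates:
            break
        if len(candidates) > 1:
            return None  # ambiguous extension
        seq += candidates[0][-1]
        used.add(candidates[0])
    return seq if len(used) == len(kmers) else None


# Illustrative intensities; "GGGG" stands in for a spurious low signal.
signals = {"ATGC": 950, "TGCC": 870, "GCCG": 910, "CCGT": 880, "CGTA": 940, "GGGG": 40}
print(assemble(positive_probes(signals, cutoff=200)))  # ATGCCGTA
```

Real targets contain repeats and missing or false signals, so a practical program would need to handle ambiguous overlaps rather than return None.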

VII. Applications 

;ng procedures may be applied to polypeptide, carbohydrate, or other polymers. Such meLs c^Se us«l n a i 

S^Z^fT^IT^ ,U, ' y - ftosTSftrby onet ™ 

base pairs would be helpful. More particularly, such methods can be used with de novo sequencing, or in conjunction

2 Ti a 2 ^T endng , procedure to pravide independent — <^ »££5S1£ SS 

hv J B ? ( f ? eXampl8, 3 ' ar9e P**™*"** defined by eiiher the Maxam and Gifbert tShntue ? 

by the Sanger technique may be verified by using the present invention technique or 

In addition, by selection of appropriate probes, a polynucleotide sequence can be fingerprinted. Fingerprinting is a
less detailed sequence analysis which usually involves the characterization of a sequence by a combination of defined
features. Sequence fingerprinting is particularly useful because the repertoire of possible features which can be tested
is virtually infinite. Moreover, the stringency of matching is also variable depending upon the application. A Southern
Blot analysis may be characterized as a means of simple fingerprint analysis.

Fingerprinting analysis may be performed to the resolution of specific nucleotides, or may be used to determine
sizSht^ 

or homology to the desired level of stringency using selected hybridization conditions.
In addition, the present invention provides means for mapping analysis of a target sequence or sequences Manoinn 
« usually involve the sequential ordering or a plurality of various sequences, or m*3?l£KES£ 5?5? 
h^lf^T 3 P,Ura,ity ° f Sa,U6nCeS - ThiS be achieved b V immobilizing p^EJlS^SE 
Alternatively, relatively shorter probes of known or rarxJom sequence may be immobilized to ft mat™ fZ^Z ol 
^ZT^T 9 * S "lT CeS ^ be determined fr0m mer,a ^- Prin ^ es ° f «* an approSare d^Z in 

Proc. Natl. Acad. Sci. USA 86:5030-5034; Michiels, et al., "Molecular Approaches to Genome Analysis: A Strategy for the
Construction of Ordered Overlap Clone Libraries," CABIOS 3:203-210 (1987); Olson, et al., "Random-Clone Strategy for
Genomic Restriction Mapping in Yeast," Proc. Natl. Acad. Sci. USA 83:7826-7830 (1986); Craig, et al., "Ordering of
Cosmid Clones Covering the Herpes Simplex Virus Type I (HSV-I) Genome: A Test Case for Fingerprinting by Hybridization,"
Nuc. Acids Res. 18:2653-2660 (1990); and Coulson, et al., "Toward a Physical Map of the Genome of the Nematode
Caenorhabditis elegans," Proc. Natl. Acad. Sci. USA 83:7821-7825 (1986).

Fingerprinting analysis also provides a means of identification. In addition to its value in apprehension of criminals
from whom a biological sample, e.g., blood, has been collected, fingerprinting can ensure personal identification for
other reasons. For example, it may be useful for identification of bodies in tragedies such as fire, flood, and plane
crashes. In other cases the identification may be useful in identification of persons suffering from amnesia, or of missing
persons. Other forensics applications include establishing the identity of a person,
or may be used in identifying the source of particular biological samples. Fingerprinting technology is described, e.g.,
in Carrano, et al., "A High-Resolution, Fluorescence-Based, Semi-automated Method for DNA Fingerprinting," Genomics
4:129-136 (1989), which is hereby incorporated herein by reference.

The fingerprinting analysis may be used to perform various types of genetic screening. For example, a single sub- 
strate may be generated with a plurality of screening probes, allowing for the simultaneous genetic screening for a large 
number of genetic markers. Thus, prenatal or diagnostic screening can be simplified, economized, and made more 
generally accessible. 

In addition to the sequencing, fingerprinting, and mapping applications, the present invention also provides means
for determining specificity of interaction with particular sequences. Many of these applications are described in U.S.S.N.
07/362,901 (VLSIPS parent), U.S.S.N. 07/492,462 (VLSIPS CIP), U.S.S.N. 07/435,316 (caged biotin parent), and
U.S.S.N. 07/612,671 (caged biotin CIP), which are incorporated herein by reference.

VIII. Libraries of Unimolecular, Double-Stranded Oligonucleotides

In one aspect, the present invention provides libraries of unimolecular double-stranded oligonucleotides, each mem- 
ber of the library having the formula: 

Y-L 1 -X 1 -L 2 -X 2

in which Y represents a solid support, X 1 and X 2 represent a pair of complementary oligonucleotides, L 1 represents a
bond or a spacer, and L 2 represents a linking group having sufficient length such that X 1 and X 2 form a double-stranded 
oligonucleotide. 

The solid support may be biological, nonbiological, organic, inorganic, or a combination of any of these, existing as
particles, strands, precipitates, gels, sheets, tubing, spheres, containers, capillaries, pads, slices, films, plates, slides, 
etc. The solid support is preferably flat but may take on alternative surface configurations. For example, the solid support 
may contain raised or depressed regions on which synthesis takes place. In some embodiments, the solid support will 
be chosen to provide appropriate light-absorbing characteristics. For example, the support may be a polymerized Langmuir
Blodgett film, functionalized glass, Si, Ge, GaAs, GaP, SiO 2 , SiN 4 , modified silicon, or any one of a variety of gels
or polymers such as (poly)tetrafluoroethylene, (poly)vinylidenedifluoride, polystyrene, polycarbonate, or combinations
thereof. Other suitable solid support materials will be readily apparent to those of skill in the art. Preferably, the surface
of the solid support will contain reactive groups, which could be carboxyl, amino, hydroxyl, thiol, or the like. More preferably,
the surface will be optically transparent and will have surface Si-OH functionalities, such as are found on silica
surfaces. 

Attached to the solid support is an optional spacer, L 1 . The spacer molecules are preferably of sufficient length to 
permit the double-stranded oligonucleotides in the completed member of the library to interact freely with molecules 
exposed to the library. The spacer molecules, when present, are typically 6-50 atoms long to provide sufficient exposure 
for the attached double-stranded DNA molecule. The spacer, L 1 , is comprised of a surface attaching portion and a longer 
chain portion. The surface attaching portion is that part of L 1 which is directly attached to the solid support. This portion 
can be attached to the solid support via carbon-carbon bonds using, for example, supports having (poly)trifluorochloroethylene
surfaces, or preferably, by siloxane bonds (using, for example, glass or silicon oxide as the solid support).
Siloxane bonds with the surface of the support are formed in one embodiment via reactions of surface attaching portions
bearing trichlorosilyl or trialkoxysilyl groups. The surface attaching groups will also have a site for attachment of the
longer chain portion. For example, groups which are suitable for attachment to a longer chain portion would include
amines, hydroxyl, thiol, and carboxyl. Preferred surface attaching portions include aminoalkylsilanes and hydroxyalkylsilanes.
In particularly preferred embodiments, the surface attaching portion of L 1 is either bis(2-hydroxyethyl)aminopropyltriethoxysilane,
2-hydroxyethylaminopropyltriethoxysilane, aminopropyltriethoxysilane or hydroxypropyltriethoxysilane.

The longer chain portion can be any of a variety of molecules which are inert to the subsequent conditions for 
polymer synthesis. These longer chain portions will typically be aryl acetylene, ethylene glycol oligomers containing 2-14
monomer units, diamines, diacids, amino acids, peptides, or combinations thereof. In some embodiments, the longer
chain portion is a polynucleotide. The longer chain portion which is to be used as part of L 1 can be selected based upon 
its hydrophilic/hydrophobic properties to improve presentation of the double-stranded oligonucleotides to certain recep- 
tors, proteins or drugs. The longer chain portion of L 1 can be constructed of polyethyleneglycols, polynucleotides, 
alkylene, polyalcohol, polyester, polyamine, polyphosphodiester and combinations thereof. Additionally, for use in syn- 
thesis of the libraries of the invention, L 1 will typically have a protecting group, attached to a functional group (i.e., 
hydroxyl, amino or carboxylic acid) on the distal or terminal end of the chain portion (opposite the solid support). After
deprotection and coupling, the distal end is covalently bound to an oligomer. 

Attached to the distal end of L 1 is an oligonucleotide, X 1 , which is a single-stranded DNA or RNA molecule. The 
oligonucleotides which are part of the present invention are typically of from about 4 to about 100 nucleotides in length. 
Preferably X 1 is an oligonucleotide which is about 6 to about 30 nucleotides in length. The oligonucleotide is typically
linked to L 1 via the 3'-hydroxyl group of the oligonucleotide and a functional group on L 1 which results in the formation
of an ether, ester, carbamate or phosphate ester linkage. 

Attached to the distal end of X 1 is a linking group, L 2 , which is flexible and of sufficient length that X 1 can effectively
hybridize with X 2 . The length of the linker will typically be a length which is at least the length spanned by two nucleotide
monomers and preferably at least four nucleotide monomers, while not being so long as to interfere with either the pairing
of X 1 and X 2 or any subsequent assays. The linking group itself will typically be an alkylene group (of from about 6 to
about 24 carbons in length), a polyethyleneglycol group (of from about 2 to about 24 ethyleneglycol monomers in a linear
configuration), a polyalcohol group, a polyamine group (e.g., spermine, spermidine and polymeric derivatives thereof),
a polyester group (e.g., poly(ethyl acrylate) having from 3 to 15 ethyl acrylate monomers in a linear configuration), a
polyphosphodiester group, or a polynucleotide (having from about 2 to about 12 nucleic acids). Preferably, the linking
group will be a polyethyleneglycol group which is at least a tetraethyleneglycol, and more preferably, from about 1 to 4
hexaethyleneglycols linked in a linear array. For use in synthesis of the compounds of the invention, the linking group
will be provided with functional groups which can be suitably protected or activated. The linking group will be covalently
attached to each of the complementary oligonucleotides, X 1 and X 2 , by means of an ether, ester, carbamate, phosphate
ester or amine linkage. The flexible linking group L 2 will be attached to the 5'-hydroxyl of the terminal monomer of X 1
and to the 3'-hydroxyl of the initial monomer of X 2 . Preferred linkages are phosphate ester linkages which can be formed
in the same manner as the oligonucleotide linkages which are present in X 1 and X 2 . For example, hexaethyleneglycol
can be protected on one terminus with a photolabile protecting group (i.e., NVOC or MeNPOC) and activated on the
other terminus with 2-cyanoethyl-N,N-diisopropylamino-chlorophosphite to form a phosphoramidite. This linking group
can then be used for construction of the libraries in the same manner as the photolabile-protected, phosphoramidite-activated
nucleotides. Alternatively, ester linkages to X 1 and X 2 can be formed when the L 2 has terminal carboxylic acid
moieties (using the 5'-hydroxyl of X 1 and the 3'-hydroxyl of X 2 ). Other methods of forming ether, carbamate or amine
linkages are known to those of skill in the art and particular reagents and references can be found in such texts as March,
Advanced Organic Chemistry, 4th Ed., Wiley-Interscience, New York, NY, 1992, incorporated herein by reference.

The oligonucleotide, X 2 , which is covalently attached to the distal end of the linking group is, like X 1 , a single-stranded
DNA or RNA molecule. The oligonucleotides which are part of the present invention are typically of from about 4 to about
100 nucleotides in length. Preferably, X 2 is an oligonucleotide which is about 6 to about 30 nucleotides in length and
exhibits complementarity to X 1 of from 90 to 100%. More preferably, X 1 and X 2 are 100% complementary. In one group
of embodiments, either X 1 or X 2 will further comprise a bulge or loop portion and exhibit complementarity of from 90 to
100% over the remainder of the oligonucleotide.
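
The 90 to 100% complementarity requirement between the two strands can be checked with a short calculation. The Python sketch below is illustrative only: it assumes DNA strands of equal length, both written 5' to 3', paired in antiparallel fashion without gaps, and it does not model the bulge or loop embodiments. The function names and example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_complementarity(x1, x2):
    """Percentage of positions at which x2 base-pairs with x1.

    Both strands are given 5' to 3' and must be the same length; the
    comparison is gap-free, so bulges and loops are not handled.
    """
    if len(x1) != len(x2):
        raise ValueError("strands must be the same length")
    perfect_partner = reverse_complement(x1)
    matches = sum(a == b for a, b in zip(perfect_partner, x2.upper()))
    return 100.0 * matches / len(x1)


x1 = "ATGCCGTAAC"
print(percent_complementarity(x1, "GTTACGGCAT"))  # 100.0
print(percent_complementarity(x1, "GTTACGGCAA"))  # 90.0
```

With 10-mers, a single mismatch already brings the pair down to the 90% lower bound stated above.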

In a particularly preferred embodiment, the solid support is a silica support, the spacer is a polyethyleneglycol conjugated
to an aminoalkylsilane, the linking group is a polyethyleneglycol group, and X 1 and X 2 are complementary oligonucleotides
each comprising from 6 to 30 nucleic acid monomers.

The library can have virtually any number of different members, and will be limited only by the number or variety of
compounds desired to be screened in a given application and by the synthetic capabilities of the practitioner. In one
group of embodiments, the library will have from 2 up to 100 members. In other groups of embodiments, the library will
have between 100 and 10,000 members, and between 10,000 and 1,000,000 members, preferably on a solid support.
In preferred embodiments, the library will have a density of more than 100 members at known locations per cm 2 , preferably
more than 1,000 per cm 2 , more preferably more than 10,000 per cm 2 .
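
The density figures above translate into a maximum spacing between neighbouring library members. As a rough worked example in Python, assuming (for illustration only) that the members sit on a regular square grid, the largest centre-to-centre pitch that still reaches a given density is 1 cm divided by the square root of that density. The function name is hypothetical.

```python
import math


def max_pitch_um(members_per_cm2):
    """Largest centre-to-centre spacing, in micrometres, on a square grid
    that still reaches the given number of members per square centimetre."""
    return 1e4 / math.sqrt(members_per_cm2)  # 1 cm = 10,000 micrometres


for density in (100, 1_000, 10_000):
    print(density, round(max_pitch_um(density), 1))
# 100 1000.0
# 1000 316.2
# 10000 100.0
```

On this assumption, the most preferred density of more than 10,000 members per cm 2 corresponds to a pitch finer than about 100 micrometres.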

Preparation of these libraries can typically be carried out using any of the methods described above for the prepa- 
ration of oligonucleotides on a solid support (e.g., light-directed methods, flow channel or spotting methods). 

IX. Libraries of Conformationally Restricted Probes

In still another aspect, the present invention provides libraries of conformationally-restricted probes. Each of the
members of the library comprises a solid support having an optional spacer which is attached to an oligomer of the
formula:

-X 11 -Z-X 12

in which X 11 and X 12 are complementary oligonucleotides and Z is a probe. The probe will have sufficient length such
that X 11 and X 12 form a double-stranded DNA portion of each member. X 11 and X 12 are as described above for X 1 and
X 2 , respectively, except that for the present aspect of the invention, each member of the probe library can have the same
X 11 and the same X 12 , and differ only in the probe portion. In one group of embodiments, X 11 and X 12 are either a poly-A
oligonucleotide or a poly-T oligonucleotide.

As noted above, each member of the library will typically have a different probe portion. The probes, Z, can be any
of a variety of structures for which receptor-probe binding information is sought for conformationally-restricted forms.
For example, the probe can be an agonist or antagonist for a cell membrane receptor, a toxin, venom, viral epitope,
hormone, peptide, enzyme, cofactor, drug, protein or antibody. In one group of embodiments, the probes are different

peptides, each having from about 4 to about 12 amino acids. Preferably the probes will be linked via polyphosphate
diesters, although other linkages are also suitable. For example, the last monomer employed on the X 11 chain can be
a 5'-aminopropyl-functionalized phosphoramidite nucleotide (available from Glen Research, Sterling, Virginia, USA or
Genosys Biotechnologies, The Woodlands, Texas, USA) which will provide a synthesis initiation site for the carboxy to
amino synthesis of the peptide probe. Once the peptide probe is formed, a 3'-succinylated nucleoside (from Cruachem,
Sterling, Virginia, USA) will be added under peptide coupling conditions. In yet another group of embodiments, the
probes will be oligonucleotides of from 4 to about 30 nucleic acid monomers which will form a DNA or RNA hairpin
structure. For use in synthesis, the probes can also have associated functional groups (i.e., hydroxyl, amino, carboxylic

SS y ndder ^ 

The surface of the solid support is preferably provided with a spacer molecule, although it will be understood that 
desaiS r atov? TJ" "* * "* * ^ ''^^ ^ ,he Sp3Cer m0,ecules wi " be as 

The libraries of conformationally restricted probes can also have virtually any number of members. As above, the
number of members will be limited only by design of the particular screening assay for which the library will be used
and by the synthetic capabilities of the practitioner. In one group of embodiments, the library will have from 2 to 100
members. In other groups of embodiments, the library will have between 100 and 10,000 members, and between 10,000
and 1,000,000 members. Also as above, in preferred embodiments, the library will have a density of more than 100
members at known locations per cm 2 , preferably more than 1,000 per cm 2 , more preferably more than 10,000 per cm 2 .
Preparation of these libraries can typically be carried out using any of the methods described above for the preparation
of oligonucleotides on a solid support (e.g., light-directed methods, flow channel or spotting methods).

X. Libraries of Intermolecular, Doubly-Anchored, Double-Stranded Oligonucleotides 

In another aspect, the present invention provides libraries of intermolecular, doubly-anchored, double-stranded oli- 
gonucleotides, each member of the library having the formula: 



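[Formula: solid support Y carrying two arms, -L 1 -X 1 and -L 2 -X 2 , with a dashed line between X 1 and X 2 ]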




In this formula, Y represents a solid support, X 1 and X 2 represent a pair of complementary oligonucleotides, and
L 1 and L 2 each represent a bond or a spacer. Typically, L 1 and L 2 are the same and are spacers having sufficient length
such that X 1 and X 2 can form a double-stranded oligonucleotide. The non-covalent binding which exists between X 1
and X 2 is represented by the dashed line.

The solid support can be any of the solid supports described herein for other aspects of the invention. Attached to 
the solid support are spacers, L 1 and L 2 . These spacers are the same as those described above for the unimolecular
double-stranded oligonucleotide embodiments. Preferably, the spacers are comprised of a surface attaching portion,
which is a hydroxyalkyltriethoxysilane or an aminoalkyltriethoxysilane, and a longer chain portion which is derived from
a poly(ethylene glycol).

Attached to the distal ends of L 1 and L 2 are X 1 and X 2 , respectively. X 1 and X 2 are each a single-stranded DNA or
RNA molecule. The oligonucleotides which are part of the present invention are typically of from about 4 to about 100
nucleotides in length. Preferably, X 1 and X 2 are each an oligonucleotide of about 6 to about 30 nucleotides in length.
The oligonucleotides are typically linked to L 1 or L 2 via the 3'-hydroxyl group of the oligonucleotide and a functional group
on L 1 or L 2 which results in the formation of an ether, ester, carbamate or phosphate ester linkage.

In one group of preferred embodiments, X 1 and X 2 are complementary oligonucleotides of about 6 to about 30 
nucleotides in length, and exhibit complementarity of from 90 to 100% over their entire length. Arrays, or libraries of 
these double-stranded oligonucleotides can be used to screen samples of DNA, RNA, proteins or drugs for their
sequence-specific interactions. 

In another group of preferred embodiments, the 5'-terminal region of X 1 (the distal portion with reference to the solid
support) will be complementary to the 5'-terminal region of X 2 (the distal portion, again with reference to the solid support).
For example, X 1 and X 2 can each be an oligonucleotide of from about 10 to about 30 nucleotides in length. The 5' end
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It TJZTIT *T 4 ? ^ 20 nUCle0,ideS WWCh be **#"™*Y to the 5' end of X* (see FIG 
A?rL^, , r th6 < d 5? ree of complementarity will typically be from about 90 to about 100% preferably abouM00% 

«e^ 

, mAmwmmm 

can JSSSF?!S*;Ii >iP ? !, t fl ' the 1 irW ! nti0n iS hybridi2ati ° n "fr™™-*. ™> ^ illustrated in FIG. 10G. As 
SS^Ti^ J .? y ° ,ntermolecula '. doubly-anchored, double-stranded oligonucleotides is oranred as 

to further enhance the discrimination of any 3' mismatches. pnospnoryiatea; 
The libraries of this aspect of the invention can also have virtually any number of different members, and will be
limited only by the number or variety of compounds desired to be screened in a given application and by the synthetic
capabilities of the practitioner. In one group of embodiments, the library will have from 2 up to 100 members. In other
groups of embodiments, the library will have between 100 and 10,000 members, and between 10,000 and 1,000,000
members, preferably on a solid support. In preferred embodiments, the library will have a density of more than 100
members at known locations per cm 2 , preferably more than 1,000 per cm 2 , more preferably more than 10,000 per cm 2 .

Preparation of these libraries can typically be carried out using any of the methods described above for the preparation
of oligonucleotides on a solid support (e.g., light-directed methods, flow channel or spotting methods). The
oligonucleotides X 1 and X 2 will be synthesized as a pair in each cell of the library. Such synthesis
^atsynthes.s.n*^ 

pie. a solid support { e.g.. a glass coverslip) can be modified with a suitable linWng group (e g Z SSbSSS 
lane, or the mono tnethoxysilylpropyl ether of a polyethylene glycol having an apprTopS £X^!3S!S^ 
S2' Ch ^re Present following the attachment of the linking groups can he unLmly protected wH vSSoSS 

So M^f ^ ^ USSd 10 depr ° teCt *°« ha * * ,he hydrox y whic " are subsequenjy £5£ as 

DMTo MMT(mono-methoxytrrtyl) ethers. In this manner, each cell or portion of the solid support wW have ^ 
equivalent numbers of two linking groups being ir«ie P enden«y removable protecting groups^ 
then proceed .n a straightfonrard manner by removing the MeNPOC groups (by irradiation) in one ceH an doSZZI 

In JS12 "I 69 ' 0 " 8 !" Pr0C6ed in 8 SimHar ™ nt]et t0 produce * e libra ries °f «• aspect of the ESn 

ilSZ' ^ ° f SyntheSIS ,0 ' ,0Win9 *■ initel StepS t0 divide *« avai,able si,es W» i deSS 

Sy^ 

XI. Methods for Screening Libraries of Double-Stranded Oligonucleotides and Probes

A library prepared according to any of the methods described above can be used to screen for receptors having
high affinity for unimolecular, double-stranded oligonucleotides, intermolecular, doubly-anchored, double-stranded oligonucleotides
or conformationally restricted probes. In one group of embodiments, a solution containing a marked
(labelled) receptor is introduced to the library and incubated for a suitable period of time. The library is then washed free of

by dentfying those regions on the surface of the library where markers are located. Suitable markers include bui are 
the ZT.T da T & - Ch T° Ph0reS ' ,,U ° r0ph0reS - chemil " m '"^ent moieties and transition ^SSS^S 

eJ^e^nedtt^ 

[...] can be exposed to a solution containing marked receptor such as a marked antibody. The receptor can be marked in
[...] receptor, the surface is placed proximate to x-ray film or phosphorimagers to identify the antigens that are recognized
[...] autoradiography is the detection method used when the marker is a radioactive label such as [...]
[...] developed and read out on a scanner [...] In a direct [...]
[...] Hood, Immunology, Benjamin/Cummings (1978), and E. Harlow et al., Antibodies: A Laboratory Manual, [...]

[...] restricted probes on predefined regions of a surface as described above. An unmarked first receptor is then bound to
[...] binding [...] for the receptors. A
[...] to the surface and incubated for a suitable [...]. The surface is then washed free of
unbound reagents and the amount of marker remaining on the surface is measured. In another form of competition
[...] region on the surface can be related to the amount of unknown receptor in solution. Yet another form of
[...] having different labels, for example, two different chromophores [...]
[...] to detect receptor binding, the double-stranded oligonucleotides which are formed
[...] probes or [...] linking group will be treated with an intercalating dye, [...]
dye. The library can be scanned to establish a background fluorescence. After exposure of the library to a [...]
solution, the exposed library will be scanned or illuminated and examined for those areas [...]
changed. Alternatively, the receptor of interest can be [...]

In instances where the libraries are synthesized on beads in a number of containers, the beads are exposed to a
receptor of interest. In a preferred embodiment the receptor is fluorescently or [...] labelled. [...]
more beads are identified that exhibit significant levels of, for example, fluorescence using one [...]
For example, in one embodiment, mechanical separation under a microscope is utilized. The [...]
on the surface of such separated beads is then identified using [...]
and sequencing of the associated DNA, or the like. In another embodiment, automated sorting (i.e., fluorescence
activated cell sorting) can be used to separate beads (bearing probes) which bind to [...]
[...] labeled receptors [...] disclosed in Needels et al., Proc. Natl. Acad. Sci.
USA 90:10700-10704 (1993), incorporated herein by reference.

The assay methods described above for the libraries of the present invention will have tremendous application in
[...] double-stranded DNA in the presence of a putative DNA binding protein. Gel analysis of cut and protected
[...] intensive. See, Galas et al., Nucleic Acids Res. 5:3157 (1978). Using the above methods, a "footprint" could be produced
using a single array of unimolecular, double-stranded oligonucleotides in a fraction of the time [...]
double-stranded DNA. Phosphorimaging or fluorescence detection will provide a footprint of those regions on the library
[...] will all be labeled with a marker, [...] a fluorescent [...]. Incorporation of
[...] can be [...] out by terminating the oligonucleotide synthesis [...] a commercially
available fluorescing phosphoramidite nucleotide derivative. Following incubation with the unlabeled protein,
the library will be treated with DNase I and examined for areas which are protected from cleavage.

The assay methods described above for the libraries of the present invention can also be used in reverse drug
[...] safety or other desired properties (e.g., [...]
[...] a variety of double-stranded oligonucleotides for potential binding [...] compound [...]

In other embodiments, probe arrays comprising β-turn mimetics can be prepared and assayed for activity against
a particular receptor. β-turn mimetics are compounds having molecular structures similar to β-turns, which are one of
the three major components in protein molecular architecture. β-turns are similar in concept to hairpin turns of
oligonucleotide strands, and are often critical recognition features for various protein-ligand and protein-protein interactions. As
a result, a library of β-turn mimetic probes can provide or suggest new therapeutic agents having a particular affinity for
a receptor which will correspond to the affinity exhibited by the β-turn and its receptor.

XII. Bioelectronic Devices and Methods

In another aspect, the present invention provides a method for the bioelectronic detection of sequence-specific 
oligonucleotide hybridization. A general method and device which is useful in diagnostics in which a biochemical species 
is attached to the surface of a sensor is described in U.S. Patent No. 4,562,157 (the Lowe patent), incorporated herein 
by reference. The present method utilizes arrays of immobilized oligonucleotides (prepared, for example, using VLSIPS™ 
technology) and the known photo-induced electron transfer which is mediated by a DNA double helix structure. See,
Murphy et al., Science 262:1025-1029 (1993). This method is useful in hybridization-based diagnostics, as a replacement
for fluorescence-based detection systems. The method of bioelectronic detection also offers higher resolution and
potentially higher sensitivity than earlier diagnostic methods involving sequencing/detecting by hybridization. As a result, 
this method finds applications in genetic mutation screening and primary sequencing of oligonucleotides. The method 
can also be used for Sequencing By Hybridization (SBH), which is described in co-pending Application Ser. Nos. 
08/082,937 (filed June 25, 1993) and 08/168,904 (filed December 15, 1993), each of which is incorporated herein by
reference for all purposes. This method uses a set of short oligonucleotide probes of defined sequence to search for 
complementary sequences on a longer target strand of DNA. The hybridization pattern is used to reconstruct the target 
DNA sequence. Thus, the hybridization analysis of large numbers of probes can be used to sequence long stretches of
DNA. In immediate applications of this hybridization methodology, a small number of probes can be used to interrogate 
local DNA sequence. 
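
The reconstruction step described above can be illustrated with a minimal sketch. It is not taken from the cited applications: it assumes an idealized, error-free hybridization pattern in which every probe of length k that matches the target is detected, and in which each (k-1)-base overlap occurs only once. The function names `kmer_spectrum` and `reconstruct` are hypothetical.

```python
def kmer_spectrum(target: str, k: int) -> set[str]:
    """Return every length-k subsequence of the target (the probes that would hybridize)."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def reconstruct(spectrum: set[str]) -> str:
    """Rebuild a sequence by chaining probes that overlap by k-1 bases.

    Assumes each (k-1)-base overlap occurs once, so the chain never branches.
    """
    by_prefix = {kmer[:-1]: kmer for kmer in spectrum}
    suffixes = {kmer[1:] for kmer in spectrum}
    if len(by_prefix) != len(spectrum) or len(suffixes) != len(spectrum):
        raise ValueError("repeated overlap: reconstruction is ambiguous")
    # The first probe is the only one whose prefix is not another probe's suffix.
    starts = [kmer for kmer in spectrum if kmer[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting probe")
    current = starts[0]
    sequence = current
    while current[1:] in by_prefix:
        current = by_prefix[current[1:]]
        sequence += current[-1]
    if len(sequence) != len(spectrum) + len(current) - 1:
        raise ValueError("some probes were not used: reconstruction is ambiguous")
    return sequence


if __name__ == "__main__":
    target = "GATTACAGGC"  # hypothetical target, for illustration only
    print(reconstruct(kmer_spectrum(target, 4)))
```

Real hybridization data contain mismatched and missing probes, so a practical reconstruction would need to tolerate errors and repeats; the sketch only shows the overlap principle.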

In the present inventive method, hybridization is monitored using bioelectronic detection. In this method, the target 
DNA, or first oligonucleotide, is provided with an electron-donor tag and then incubated with an array of oligonucleotide 
probes, each of which bears an electron-acceptor tag and occupies a known position on the surface of the array. After
hybridization of the first oligonucleotide to the array has occurred, the hybridized array is illuminated to induce an electron 
transfer reaction in the direction of the surface of the array. The electron transfer reaction is then detected at the location 
on the surface where hybridization has taken place. Typically, each of the oligonucleotide probes in an array will have 
an attached electron-acceptor tag located near the surface of the solid support used in preparation of the array. In 
embodiments in which the arrays are prepared by light-directed methods (i.e., typically in the 3' to 5' direction), the
electron-acceptor tag will be located near the 3' position. The electron-acceptor tag can be attached either to the 3' monomer by
methods known to those of skill in the art, or it can be attached to a spacing group between the 3' monomer and the 
solid support. Such a spacing group will have, in addition to functional groups for attachment to the solid support and 
the oligonucleotide, a third functional group for attachment of the electron-acceptor tag. The target oligonucleotide will 
typically have the electron-donor tag attached at the 3' position. Alternatively, the target oligonucleotide can be incubated 
with the array in the absence of an electron-donor tag. Following incubation, the electron-donor tag can be added in 
solution. The electron-donor tag will then intercalate into those regions where hybridization has occurred. An electron 
transfer reaction can then be detected in those regions having a continuous DNA double helix. 

The electron-donor tag can be any of a variety of complexes which participate in electron transfer reactions and 
which can be attached to an oligonucleotide by a means which does not interfere with the electron transfer reaction. In 
preferred embodiments, the electron-donor tag is a ruthenium (II) complex, more preferably a ruthenium (II)
(phen')₂(dppz) complex.

The electron-acceptor tag can be any species which, with the electron-donor tag, will participate in an electron 
transfer reaction. An example of an electron-acceptor tag is a rhodium (III) complex. A preferred electron-acceptor tag 
is a rhodium (III) (phi)₂(phen') complex.

In a particularly preferred embodiment, the electron-donor tag is a ruthenium (II) (phen')₂(dppz) complex and the
electron-acceptor tag is a rhodium (III) (phi)₂(phen') complex.

In still another aspect, the present invention provides a device for the bioelectronic detection of sequence-specific 
oligonucleotide hybridization. The device will typically consist of a sensor having a surface to which an array of
oligonucleotides is attached. The oligonucleotides will be attached in pre-defined areas on the surface of the sensor and have
an electron-acceptor tag attached to each oligonucleotide. The electron-acceptor tag will be a tag which is capable of 
producing an electron transfer signal upon illumination of a hybridized species, when the complementary oligonucleotide 
bears an electron-donating tag. The signal will be in the direction of the sensor surface and be detected by the sensor.

In a preferred embodiment, the sensor surface will be a silicon-based surface which can sense the electronic signal 
induced and, if necessary, amplify the signal. The metal contacts on which the probes will be synthesized can be treated
with an oxygen plasma prior to synthesis of the probes to enhance the silane adhesion and concentration on the surface.
The surface will further comprise a multi-gated field effect transistor, with each gate serving as a sensor and different
oligonucleotides attached to each gate. The oligonucleotides will typically be attached to the metal contacts on the sensor
surface by means of a spacer group.

The spacer group should not be too long, in order to ensure that the sensing function of the device is easily activated 
by the binding interaction and subsequent illumination of the "tagged" hybridized oligonucleotides. Preferably, the spacer
group is from 3 to 12 atoms in length and will be as described above for the surface modifying portion of the spacer
group, L1.

The oligonucleotides which are attached to the spacer group can be formed by any of the solid phase techniques 
which are known to those of skill in the art. Preferably, the oligonucleotides are formed one base at a time in the direction 
of the 3' terminus to the 5' terminus by the "light-directed" methods described above. The oligonucleotide can then be
modified at the 3' end to attach the electron-acceptor tag. A number of suitable methods of attachment are known. For
example, modification with the reagent Aminolink2 (from Applied Biosystems, Inc.) provides a terminal phosphate moiety 
which is derivatized with an aminohexyl phosphate ester. Coupling of a carboxylic acid, which is present on the electron- 
acceptor tag, to the amine can then be carried out using HOBT and DCC. Alternatively, synthesis of the oligonucleotide 
can begin with a suitably derivatized and protected monomer which can then be deprotected and coupled to the
electron-acceptor tag once the complete oligonucleotide has been synthesized.

The silica surface can also be replaced by silicon nitride or oxynitride, or by an oxide of another metal, especially 
aluminum, titanium (IV) or iron (III). The surface can also be any other film, membrane, insulator or semiconductor 
overlying the sensor which will not interfere with the detection of electron transfer and to which an oligonucleotide
can be coupled.

Additionally, detection devices other than an FET can be used. For example, sensors such as bipolar transistors, 
MOS transistors and the like are also useful for the detection of electron transfer signals. 

XIII. Alternative Embodiments 

A. Adhesives 

In still another aspect, the present invention provides an adhesive comprising a pair of surfaces, each having a 
plurality of attached oligonucleotides, wherein the single-stranded oligonucleotides on one surface are complementary 
to the single-stranded oligonucleotides on the other surface. The strength and position/orientation specificity can be 
controlled using a number of factors including the number and length of oligonucleotides on each surface, the degree 
of complementarity, and the spatial arrangement of complementary oligonucleotides on the surface. For example, 
increasing the number and length of the oligonucleotides on each surface will provide a stronger adhesive. Suitable 
lengths of oligonucleotides are typically from about 10 to about 70 nucleotides. Additionally, the surfaces of oligonucle- 
otides can be prepared such that adhesion occurs in an extremely position-specific manner by a suitable arrangement 
of complementary oligonucleotides in a specific pattern. Small deviations from the optimum spatial arrangement are 
energetically unfavorable as many hybridization bonds must be broken and are not reformed in any other relative orien- 
tation. 

The adhesives of the present invention will find use in numerous applications. Generally, the adhesives are useful 
for adhering two surfaces to one another. More specifically, the adhesives will find application where biological compat- 
ibility of the adhesive is desired. An example of a biological application involves use in surgical procedures where tissues 
must be held in fixed positions during or following the procedure. In this application, the surfaces of the adhesive will 
typically be membranes which are compatible with the tissues to which they are attached. 

A particular advantage of the adhesives of the present invention is that when they are formed in an
orientation-specific manner, the adhesive portions will be "self-finding," that is, the system will go to the thermodynamic equilibrium
in which the two sides are matched in the predetermined, orientation-specific manner.

B. Methods For Preparing Single-Stranded Nucleic Acid Sequences

In a further embodiment, the present invention provides a method of using a chip, i.e., an array, of oligonucleotides 
to direct the synthesis of long, single-stranded nucleic acid sequences. More particularly, the present invention provides 
a method of directing the synthesis of a single-stranded nucleic acid sequence, the method comprising: (a) forming a 
hybrid complex by combining at least two oligonucleotides which are phosphorylated at their 5' ends with a chip-bound 
oligonucleotide, the chip-bound oligonucleotide having subsequences which are complementary to a subsequence of 
each of the oligonucleotides; (b) contacting the hybrid complex with a ligase to form a ligated oligonucleotide; and (c) 
releasing the ligated oligonucleotide from the chip-bound oligonucleotide to form a single-stranded nucleic acid 
sequence. 

The foregoing method is illustrated in FIG. 17A. As shown in FIG. 17A, the joining of Oligo 1 (O₁) and Oligo 2 (O₂)
is directed by a chip-bound oligonucleotide having subsequences which are complementary to the ends of O₁ and O₂.
The oligonucleotides, e.g., O₁ and O₂, are typically greater than 20 nucleotides in length and they are phosphorylated
at their 5' ends. Any enzyme that catalyzes the formation of a phosphodiester bond at the site of a single-strand break
in duplex DNA can be used in this method of the present invention. Such ligases include, but are not limited to, T4 DNA
ligase, ligases isolated from E. coli and ligases isolated from other bacteriophages. In a presently preferred embodiment,
T4 DNA ligase is the ligase used. The concentration of the ligase will vary depending on the particular ligase used, the
concentration of oligonucleotides and buffer conditions, but will typically range from about 500 units/ml to about 5,000
units/ml. Moreover, the time in which the hybrid complex is in contact with the ligase will vary. Typically, the ligase
treatment is carried out for a period of time ranging from minutes to hundreds of hours.

It will be readily apparent to those of skill in the art that using the method of the present invention, multiple
oligonucleotides, e.g., Oligos O₁-Oₙ, can be joined together by a series of ligation reactions directed by the chip-bound
oligonucleotides (See, e.g., FIG. 17B). After each ligation step, the temperature needs to be raised and/or the salt
concentration reduced to allow the ligated oligonucleotide to be released from the surface. Many cycles of hybridization, 
ligation and heating will be necessary for complete synthesis. However, only a small amount of the full-length product 
needs to be synthesized as it can be amplified using PCR subsequent to the ligation steps. 
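
The way a chip-bound oligonucleotide directs each junction can be sketched as follows. This is an illustrative sketch only, not the procedure of FIGS. 17A-17B: the function names and the `arm` parameter (the number of bases of each oligonucleotide end that the template covers) are assumptions introduced here, and the short sequences in the usage lines are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_template(upstream: str, downstream: str, arm: int) -> str:
    """Template (5' to 3') that pairs with the 3' end of `upstream` and the 5' end of `downstream`.

    Hybridizing both oligonucleotides to this template places the 3' end of
    `upstream` next to the 5' (phosphorylated) end of `downstream` for ligation.
    """
    if arm < 1 or arm > min(len(upstream), len(downstream)):
        raise ValueError("arm must fit within both oligonucleotides")
    junction = upstream[-arm:] + downstream[:arm]
    return reverse_complement(junction)


def templates_for(oligos: list[str], arm: int) -> list[str]:
    """One junction template for each adjacent pair in an ordered list of oligonucleotides."""
    return [junction_template(a, b, arm) for a, b in zip(oligos, oligos[1:])]


if __name__ == "__main__":
    # Hypothetical 8-mers; the text describes oligonucleotides typically longer than 20 nucleotides.
    print(templates_for(["ACGTACCA", "TTGACGGA", "CCCCAAAA"], arm=3))
```

Changing the order of the input list changes the set of junction templates, which is the sense in which a different chip encodes a different set of junctions.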

Moreover, it will be readily apparent to those of skill in the art that the chip can consist of a wide variety of oligonu- 
cleotides that would allow a large number of different single-stranded nucleic acid sequences to be constructed. The 
chip can have virtually any number of different oligonucleotides, and will be limited only by the number or variety of 
single-stranded nucleic acid sequences desired and by the synthetic capabilities of the practitioner. In one group of 
embodiments, the chip will have from 1 up to 100 members. In other groups of embodiments, the chip will have between
100 and 10,000 members, and between 10,000 and 1,000,000 members. In preferred embodiments, the chip will have
a density of more than 100 members at known locations per cm², preferably more than 1,000 per cm², more preferably
more than 10,000 per cm².

In addition to the foregoing, site-directed "mutant" sequences can be made by using "mutated" Oᵢ oligonucleotides.
If the mutation is at an internal position of Oᵢ, the same chip-bound oligonucleotides are appropriate for the ligation steps.
If, however, the mutation is near a junction, different chip-bound oligonucleotides will be required. The chip can consist
of a wide variety of oligonucleotides that would allow a large number of different sequences to be constructed. Moreover,
shuffled genes (Oᵢ in a different order) can also be made using a different chip that encodes a different set of junctions.
In addition, a family of mutant genes can be made by using pools of oligonucleotides in solution and a chip that contains 
templates for all possible, correctly ordered junctions. 

In another embodiment, the oligonucleotides, i.e., Oᵢ, can be synthesized on a chip and selectively released into
solution. This embodiment can be carried out using a photo-labile linker (See, FIG. 17C). Any gene or mutant gene can 
be synthesized by selectively releasing the desired oligonucleotides into solution prior to the series of ligation reactions. 
This would provide an incredibly diverse mutant-generation capacity, with the specific synthetic product(s) determined 
by the irradiation steps used to release the specific set of oligos (and the junctions encoded by the chip). A mutant 
sequence or, alternatively, a family of mutant sequences could be simply selected by the choice of photolysis steps that 
produce the desired reactant oligos. In this embodiment, it is best if the photolysis wavelength of the photolabile linker 
is different from the wavelength used to remove the MeNPOC group during synthesis. Moreover, the photolysis wavelength
must also be compatible with phosphoramidite synthesis steps. Such photolabile linkers include, but are not
limited to, ortho-nitrobenzyl groups and derivatives thereof.

XIV. Examples 

The following examples are provided to illustrate the efficacy of the inventions herein. 
A. ENHANCED DISCRIMINATION USING RNase A

This example illustrates the ability of RNase A to recognize and cut single-stranded RNA, including RNA in DNA:RNA 
hybrids that is not in a perfect double-stranded structure. RNA bulges, loops, and even single base mismatches can, for 
example, be recognized and cleaved by RNase A. RNase A treatment is used herein to improve the quality of RNA 
hybridization signals on high density oligonucleotide arrays. 

EXAMPLE I 

The high density array of oligonucleotide probes on a glass substrate (referred to as a "chip") is prepared using the 
standard VLSIPS protocols set forth above. Moreover, the pattern of oligonucleotide probes is based on the standard 
tiling strategy shown in FIG. 5. Briefly, the chip used in this example consists of an overlapping set of DNA 15-mers
covalently linked to a glass surface. A set of four probes for each nucleotide of a 1.3 kb region spanning the D-loop
region of human mitochondrial DNA (mtDNA) is present on the substrate. Each of the four probes contains a different
base (A, C, G or T) at the position being interrogated, with the substitution position being near the center of the probe.
Because the probes are specifically selected based on the mtDNA target sequence, one of the four probes will be
perfectly complementary to the mtDNA target, and the other three will contain a central base-pairing mismatch. The
mismatch probes are expected to hybridize to a lesser extent. By incorporating fluorophores into the target DNA or RNA,
the extent of hybridization at the four positions for each base can be quantitated using fluorescence imaging. In principle,
the correct target base is simply identified as the complement to the probe base giving rise to the largest hybridization
signal. 
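
The tiling arrangement described above can be sketched in a few lines. This is a minimal illustration, not the chip layout of FIG. 5: it assumes an odd probe length with the substitution exactly at the center, and the function name `tiled_probes` and the short target in the usage lines are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def tiled_probes(target: str, position: int, length: int = 15) -> dict[str, str]:
    """Four probes (5' to 3') interrogating target[position].

    Keys are the base that each probe carries at its central position. The
    probe keyed by the complement of target[position] is the perfect match;
    the other three carry a central mismatch.
    """
    if length % 2 == 0:
        raise ValueError("this sketch assumes an odd probe length")
    half = length // 2
    start = position - half
    if start < 0 or start + length > len(target):
        raise ValueError("probe window falls outside the target")
    window = target[start:start + length]
    probes = {}
    for base in "ACGT":
        # Substitute the target base that would pair with `base`, then take
        # the reverse complement to obtain the probe sequence.
        variant = window[:half] + COMPLEMENT[base] + window[half + 1:]
        probes[base] = reverse_complement(variant)
    return probes


if __name__ == "__main__":
    target = "ACGTACGTACGTACGTACG"  # hypothetical target, for illustration only
    for base, probe in tiled_probes(target, 9).items():
        print(base, probe)
```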

Generally, a "base identification" is considered to be made if the signal in one of the four probe regions is greater 
than twice as large as the signal in a nearby region that contains no oligonucleotide probes (referred to herein as the 
"background"), and if the signal is at least 1.2 times as large as in the other three related probe regions on the chip. If 
the signal in more than one of the probe regions is larger than twice the background, but is not greater than the other 
three by at least a factor of 1.2, then a "multiple-base ambiguity" is indicated. For example, if the T-containing and the
C-containing probes have high but similar hybridization signals, a two-base ambiguity would result (a call of either of the
complementary bases, A or G, could be made). All two-base ambiguities are possible, as well as all 3- and 4-base
ambiguities. If the most intense hybridization signal (largest by at least a factor of 1.2) is in the region that is not complementary
to the target sequence, then an "incorrect call" is made (referred to herein as a "miscall"). As shown below, the RNase
A treatment resolves multiple-base ambiguities and reduces the number of miscalls that result from hybridization of a
1.3 kb RNA target to the mitochondrial probe chip described above.

Labelled mitochondrial RNA samples are prepared using standard PCR and in vitro transcription procedures. The 
1.3 kb RNA sample is labelled by incorporation of fluorescein-labelled UTP during transcription (approximately 10% of
Us in the RNA sample are labelled). The RNA (approximately 200 nM concentration of 1.3 kb transcripts) is partially
fragmented by heating to 99.9°C for 60 minutes in 6 mM magnesium chloride, pH 8. This procedure produces a wide
range of fragment lengths, with an average length of approximately 200 nucleotides. After fragmentation, the RNA sample
is diluted to 10 nM in 60 mM sodium phosphate, 0.9 M NaCl, 6 mM EDTA, 0.05% Triton X-100, pH 7.9 (referred to as
6XSSPE-T). For hybridization, 10 mM CTAB (cetyltrimethylammonium bromide) is added. The RNA sample is hybridized
to the chip in a 1 ml flow cell at 22°C for 40 minutes with stirring provided by bubbling nitrogen gas through the flow cell.
Following hybridization, the chip is rinsed with 6XSSPE-T and the fluorescence signal is detected using a scanning
confocal fluorescence microscope ("reading" the chip) (See, FIG. 6). The image is stored for later analysis. The chip is
then treated with 75 μl of 0.2 μg/ml RNase A in 6XSSPE-T at 22°C for intervals of 10, 45, and 75 minutes. After each
interval, the chip is rinsed with 6XSSPE-T and the fluorescence signal is read (See, FIG. 7). The results are analyzed
to determine the number of correct base calls, multiple-base ambiguities and miscalls, and the improvement resulting 
from the RNase A treatments. 

After the original hybridization, 619 out of 1302 bases were called correctly (approximately 47%). Of the remaining, 
there were 218 miscalls, 458 multiple-base ambiguities, and 17 instances where the signal was not more than twice the
background. (These numbers are subject to the conditions of the experiment. In particular, they are a function of
hybridization time and temperature, salt concentration, the presence of Triton X-100 and CTAB, and the extent of RNA
fragmentation and labelling. The conditions used here, in particular the limited fragmentation of the RNA, are ones that tend
to decrease the number of regions with low signal, and to increase the number of miscalls and ambiguities.) Following
treatment with RNase A (and combining the information for the three time points), 162 out of 218 miscalls were corrected,
and 350 out of 458 ambiguities were correctly resolved. There were only 46 bases that were initially ambiguous which
were resolved incorrectly, and there were no instances of correct calls that were changed to incorrect calls after RNase
A treatment. After the initial hybridization, only 47% of the entire sequence was called correctly. However, when the
hybridization results are combined with the results following RNase A treatment, approximately 87% of the 1302 bases
are called correctly. These results clearly demonstrate that RNase A is very effective in improving the quality of the
sequence information obtained from hybridization to oligonucleotide arrays.
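
The percentages quoted above can be checked with a short calculation. The counts are those given in the text; the one assumption is that the combined number of correct calls is the sum of the initially correct calls, the corrected miscalls and the correctly resolved ambiguities, which is consistent with the statement that no correct call was changed to an incorrect one.

```python
total_bases = 1302
correct_initial = 619          # correct calls after hybridization alone
miscalls_corrected = 162       # of 218 miscalls
ambiguities_resolved = 350     # of 458 multiple-base ambiguities

initial_fraction = correct_initial / total_bases

# Assumption: corrected miscalls and correctly resolved ambiguities add to the
# initial correct calls, since no correct call was lost after RNase A treatment.
combined_correct = correct_initial + miscalls_corrected + ambiguities_resolved
combined_fraction = combined_correct / total_bases

print(f"initial:  {correct_initial}/{total_bases} = {initial_fraction:.1%}")
print(f"combined: {combined_correct}/{total_bases} = {combined_fraction:.1%}")
```

The combined figure of 1131 correct calls out of 1302 corresponds to the approximately 87% reported.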

B. ENHANCED DISCRIMINATION USING LIGATION REACTIONS

The following examples illustrate the ability of ligation reactions to improve discrimination of base-pair mismatches 
near the 5' end of an oligonucleotide probe. The ligation reaction of labelled, short oligonucleotides to the 5' end of 
oligonucleotide probes on a chip should occur (in the presence of the enzyme ligase) wherever a probe:target hybrid
has formed with correct base-pairing near the 5' end of the probe and where there is a suitable 3' overhang of the target
to serve as a template for hybridization and ligation. In the following examples, the ligation reaction is used to improve
discrimination of base-pair mismatches near the 5' end of the probe, i.e., mismatches which are often poorly discriminated
following hybridization alone. 

EXAMPLE I

In this example, a chip is made with probes having the following sequence: P-P-A-A-CGCGCCGCNC-5', wherein
P is a polyethyleneglycol (PEG) spacer, A, C, and G are the usual deoxynucleotides, and N is either A, C, G, or T. The
chip is made using the standard VLSIPS protocols set forth above. The target oligonucleotide is a 20-mer having the
following sequence (listed 5' to 3'):

F1-GCGCGGCGCGAACGCAACGC
wherein: F1 is a fluorescein molecule covalently attached at the 5' end. The labelled, ligatable 6-mer used in this example 
has the following sequence: 

F1-TGCGTT. 

The 5' half of the 20-mer target is complementary to the probes on the chip for which N is a G. The probe:target hybrids
for the other three probes have a single base mismatch one base in from the 5' end of the probe. The ligatable 6-mer 
is complementary to the 3' overhang of the target when the target is hybridized to the probe to form the maximum number 
of Watson-Crick hydrogen bonds. 
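
The base-pairing relationships stated above can be verified directly from the sequences given in this example. The sketch below is illustrative only; the function names are hypothetical, and the probe is taken as the ten bases written between the A-A spacer residues and the 5' end, read 3' to 5' as in the text.

```python
PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}

PROBE_3TO5 = "CGCGCCGCNC"          # probe as written in the text (3' to 5'), spacer omitted
TARGET = "GCGCGGCGCGAACGCAACGC"    # 20-mer target, 5' to 3'
SIXMER = "TGCGTT"                  # ligatable 6-mer, 5' to 3'


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(seq))


def count_mismatches(probe_3to5: str, target_5to3: str) -> int:
    """Mismatches when the 5' end of the target pairs antiparallel with the probe."""
    return sum(PAIR[t] != p for p, t in zip(probe_3to5, target_5to3))


def overhang_site(probe_3to5: str, target_5to3: str, size: int) -> str:
    """Target bases immediately beyond the 5' end of the probe (5' to 3')."""
    start = len(probe_3to5)
    return target_5to3[start:start + size]


if __name__ == "__main__":
    for n in "ACGT":
        probe = PROBE_3TO5.replace("N", n)
        print(f"N = {n}: {count_mismatches(probe, TARGET)} mismatch(es)")
    site = overhang_site(PROBE_3TO5, TARGET, len(SIXMER))
    print("6-mer pairs with overhang:", reverse_complement(site) == SIXMER)
```

The same check applies to any probe, target and ligatable oligonucleotide written in these orientations.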

Prior to hybridization and ligation, the chip is treated with T4 Polynucleotide Kinase in order to phosphorylate the 5' 
end of the probes. The probes are phosphorylated using 100 units of T4 Polynucleotide Kinase (New England Biolabs) 
in 1 ml at 37°C for 90 minutes. 

A 10 nM solution of the target oligo in 6XSSP-T (no EDTA in the hybridization buffer because EDTA could interfere 
with subsequent ligation reactions) is hybridized to the chip for 30 minutes at 22°C. The chip is scanned, and then washed 
with a large amount of water to remove the labelled target molecules. 

The ligation reaction is carried out at 16°C in a 1 ml flow cell containing 10 nM target oligo, 20 nM ligatable 6-mer,
and 4000 units of T4 DNA Ligase (New England Biolabs). The buffer is the buffer recommended by the manufacturer
plus 150 mM NaCl. The reaction is allowed to proceed for 14 hours at 16°C, after which the chip is vigorously washed
with water at 50°C to remove the labelled target molecules. The only fluorescent label remaining after washing is that 
of the ligatable 6-mers that have been covalently attached to the probes via the ligation reaction. The chip is scanned 
and analyzed, and the results compared to those obtained from the hybridization reaction above.

In the above table, N is the base in the probe that is one position in from the 5' end (see, supra). For the target used
here, G is the complementary base. HYB and LIG are the signals (fluorescence counts) for the different probes following 
hybridization and ligation, respectively. HDF and LDF are the discrimination factors (defined as the ratio of the fluores- 
cence signal with the perfect match, G, to the signal with the specified mismatch base) following hybridization and ligation, 
respectively. 
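
The discrimination factor defined above is a simple ratio, sketched below. The function name is hypothetical, and the fluorescence counts in the usage lines are invented for illustration only; they are not the values from the table in this example.

```python
def discrimination_factors(signals: dict[str, float], perfect_match: str) -> dict[str, float]:
    """Ratio of the perfect-match signal to the signal of each mismatch probe.

    `signals` maps the probe base N to its fluorescence counts; `perfect_match`
    is the probe base complementary to the target (G in this example).
    """
    reference = signals[perfect_match]
    return {base: reference / counts
            for base, counts in signals.items() if base != perfect_match}


if __name__ == "__main__":
    # Hypothetical counts, NOT the experimental data.
    hypothetical_counts = {"A": 80.0, "C": 100.0, "G": 100.0, "T": 50.0}
    print(discrimination_factors(hypothetical_counts, perfect_match="G"))
```

A factor close to 1 means the mismatch probe is nearly indistinguishable from the perfect match; larger factors mean better discrimination.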

It is clear that after hybridization, the extent of target hybridization is very similar for the perfectly complementary 
probe and the probes containing a mismatch near the 5' end. The A and C mismatches differ by only 10%, and the 
maximum difference is only 40%. In contrast, following the ligation reaction, the discrimination is greatly improved, with 
the minimum discrimination factor greater than 4. These data indicate that ligation reactions can be performed on cov- 
alently attached oligonucleotide probes on the chip surface, that these reactions are specific for correctly base-paired 
probe:target hybrids, and that the reaction can be used to improve the discrimination between perfect matches and 
single base mismatches. 

EXAMPLE II

In this example, a chip was made with probes having the following sequences: 

P-P-A-A-CGCGCATTCN-5' (denoted CG) 

P-P-A-A-ATATAATTCN-5' (denoted AT) 
A, T, C, G and N have the same definitions as those set forth in Example I, supra. These probes contain a perfect match 
and the single-base mismatch sequences for the following 22-mer target oligos (listed 5' to 3'):

F1-GCGCGTAAGGCCTTCGACGTAG (denoted OH1) 

F1-TATATTAAGGCCTTCGACGTAG (denoted OH2)
The 5' end of OH1 is complementary to the CG probes with N = C, and the 5' end of OH2 is complementary to the AT 
probes with N = C. Both OH1 and OH2 have the same 12-mer sequence at the 3' end. The labelled, ligatable 6-mer 
used in this example (appropriate for both OH1 and OH2 when hybridized to the CG and AT regions of the chip, respectively)
has the following sequence:
F1-CGAAGG (denoted L6B). 
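The base-pairing relationships among the probes, the targets and the 6-mer can be checked with a short sketch. It assumes that L6B is written 5' to 3' (the text states this only for the targets) and that the probes are written 3' to 5', as their trailing 5' label indicates. The helper names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq):
    """Base-by-base complement, keeping the written order."""
    return "".join(COMPLEMENT[b] for b in seq)


def reverse_complement(seq):
    """Complement read in the opposite direction."""
    return complement(seq)[::-1]


OH1 = "GCGCGTAAGGCCTTCGACGTAG"  # target, 5' to 3'
OH2 = "TATATTAAGGCCTTCGACGTAG"  # target, 5' to 3'
L6B = "CGAAGG"                  # ligatable 6-mer, assumed 5' to 3'

# A probe written 3' to 5' that pairs with the first ten target bases is
# simply their complement in the same written order.
cg_probe = complement(OH1[:10])
at_probe = complement(OH2[:10])
print(cg_probe, at_probe)  # both end in C, i.e. N = C is the perfect match

# The 6-mer pairs with the next six target bases, which places its 3' end
# next to the 5' phosphate of the probe.
print(reverse_complement(OH1[10:16]) == L6B)
```

Because OH1 and OH2 share the same twelve bases at the 3' end, the same 6-mer serves both targets.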
Prior to hybridization and ligation, the chip is phosphorylated as in Example I, supra, using T4 polynucleotide kinase 
for 4 hours at 37°C. The hybridization and ligation conditions are the same as those used in Example I unless otherwise 
specified. In particular, 2000 units of T4 DNA Ligase are used for the reaction here, and the concentration of the ligatable
6-mer is 10 nM rather than 20 nM. 

The hybrids between OH1 and the CG probes on the chip contain a high proportion of C-G base pairs. C-G base 
pairs are known to be considerably more stable than the A-T base pairs that are predominant in the hybrid between 
OH2 and the AT probes on the chip. Thus, it is expected that OH1 will hybridize to its perfectly complementary probe
to a greater extent than will OH2 under suitably stringent hybridization conditions. In fact, this is observed to be
the case in the hybridization experiments below. The ligation reaction, however, can be used to help mitigate the com- 
plicating effects of the base composition dependence of hybridization. 

The chip was initially hybridized with both OH1 and OH2 at 22°C for 30 minutes. The extent of hybridization to both 
the CG and AT regions of the chip is analyzed. It is found that the fluorescence signal in the CG regions (OH1 hybrids) 
is larger than in the AT regions (OH2 hybrids) by more than a factor of 14. In fact, the perfect match signal in the CG
region is quite strong, but the signal in the AT region is only slightly greater than twice the background. 
* These values are somewhat uncertain because the signal is not large relative to the background.



Following hybridization, the chip was washed extensively with water to remove the target molecules. A ligation 
reaction is initiated on the chip by combining OH1, OH2, and L6B in 1 ml of ligation buffer and adding 2000 units of T4
DNA ligase. The reaction is allowed to proceed for 34 hours at 22°C, and then for another 24 hours at 8°C. At each
stage, the chip is read and the data recorded and analyzed. 
It is striking that after the ligation reaction at 8°C, the signals for OH1 and OH2 differ by only a factor of 1.4, ten
times less than the factor of 14 that was observed following the original hybridization. It is even more striking that the
composition dependence is mitigated by virtue of the ligation reaction at low temperature with no loss of discrimination
for either OH1 or OH2.
In order for the ligation strategy to be useful for unknown or more complex DNA targets, it is necessary to use a
pool of all possible (4096) 6-mers instead of a specific ligatable 6-mer. The 4096 6-mers are synthesized using standard
phosphoramidite chemical procedures on four separate columns, one beginning (at the 3' end) with A, one with C, one
with G, and one with T. Each of the 5 subsequent synthesis steps is performed using a mixture of A, C, G, and T
phosphoramidites, producing a mixture of all possible five base sequences on each of the four columns. The 6-mers are
labelled with fluorescein at the 5' end as the last step in the synthesis. After reversed-phase HPLC purification of the
four 6-mer pools, the concentration of each pool is determined by the absorption at 260 nm. The appropriate amounts
of each pool are mixed to make a solution that contains all 4096 labelled 6-mer oligonucleotides.
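The size of the pool follows from 4^6 = 4096, with 4^5 = 1024 sequences per column. A minimal Python sketch of this enumeration is given below; the function name is hypothetical, and the 0.4 nM per-oligo figure is the one quoted for the ligation conditions in this example, used here only to illustrate the arithmetic.

```python
from itertools import product

BASES = "ACGT"


def six_mer_columns():
    """Group all 6-mers (written 5' to 3') by their 3'-terminal base,
    mirroring the four synthesis columns."""
    columns = {base: [] for base in BASES}
    for three_prime in BASES:
        # Five mixed-base coupling steps give every 5-base prefix.
        for prefix in product(BASES, repeat=5):
            columns[three_prime].append("".join(prefix) + three_prime)
    return columns


columns = six_mer_columns()
pool = [seq for column in columns.values() for seq in column]
print(len(pool), {base: len(col) for base, col in columns.items()})

# Total pool concentration if every oligo is present at 0.4 nM.
per_oligo_nM = 0.4
total_uM = per_oligo_nM * len(pool) / 1000.0
print(round(total_uM, 2), "uM total")
```

The total works out to roughly 1.6 micromolar, three orders of magnitude above the per-oligo concentration.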
A chip is made containing 10-mer probes having the following sequences:
P-P-C-G-C-G-N1-N2-N3-N4-N5-N6-5'
wherein each Ni is A, C, G, or T. In other words, the chip contains 10-mers with all possible (4096) six base combinations
at the 5' end. The 5' phosphate group on the probes required for ligation is added chemically (using 5' Phosphate-ON,
Clontech Laboratories, Palo Alto, CA) as the last step in the synthesis of the chip, prior to deprotection of the bases.
The target oligo is a 22-mer having the following sequence (listed 5' to 3'):
F1-GCGCGTAAGGCCTTCGACGTAG (OH1)

The chip was initially hybridized with 10 nM OH1 in 6XSSP-T at 22°C for 30 minutes. The chip is read and analyzed.
The only perfect match probe for this target (i.e., PP-CGCGCATTCC-5') has the second highest hybridization signal.

Eight other probes have hybridization signals that are within a factor of 4 of the perfect match signal. The other three
probes with a single base mismatch at the 5' end have discrimination factors of 2.0, 2.6, and 3.5, for G, A, and T, respec- 
tively. Other single base mismatches at positions in from the 5' end of the probe give signals that are considerably smaller. 
The chip is washed with water to remove the hybridized target. 

The chip is next hybridized using the conditions used for the ligation reaction. The chip is hybridized with 10 nM
OH1 and 1.6 µM 6-mer pool (0.4 nM for each 6-mer oligo) in the ligation buffer for 11 hours at 22°C (no ligase at this
stage). The perfect match probe gives the highest signal by a factor of 2.4. Five probes have signals within a factor of
4 of the perfect match signal. The other three probes with a single base mismatch at the 5' end have discrimination
factors of 3.0, 3.6, and 8.0, for G, A, and T, respectively.

The ligation reaction is initiated by the addition of 2000 units of T4 DNA ligase to the solution containing OH1 and
the pool of 6-mers. The reaction is allowed to proceed for 23 hours at 22°C. After washing the chip with water at about
45°C for five minutes, the chip is read. After ligation, no other probes have hybridization signals that are within a factor
of 4 of the perfect match signal. The three 5' single base mismatch probes all have discrimination factors greater than
12. Thus, with a complex chip containing 4096 probes with all possible 6-mer sequences at the 5' end, and using a pool
of all possible ligatable 6-mers, the ligation reaction is still specific for the perfectly complementary probe and affords
considerable increases in the discrimination between perfect matches and single-base mismatches.

EXAMPLE IV 

In this example, a chip was made using the tiling strategy (A, C, G, T-containing probes for each base in the
sequence) described above that covers a 50 base region of the protease gene of HIV-1 (SF2 strain). The probes are
11-mers, linked to the glass support by three PEG linkers. The substitution position (the position being interrogated by
an A, C, G, or T base in the probe) is varied between the 5' end of the probe and five bases in from the 5' end (referred
to as positions end, -1, -2, -3, -4 and -5). The chip is synthesized using standard VLSIPS protocols. Prior to hybridization
and ligation, the chip is phosphorylated using T4 polynucleotide kinase for 5 hours at 37°C. The target is a 75-mer
oligonucleotide (denoted Hprol), labelled at the 5' end with fluorescein, that spans the complementary 50 base region
on the chip.
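One way to enumerate a tiled probe set of this kind is sketched below. The window geometry is an assumption made for illustration, not the layout of the actual chip: each probe is taken to be the reverse complement of an 11-base target window, and `offset` counts bases in from the probe's 5' end, so that 0 corresponds to "end" and 5 to position -5. The function name and the example target are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def tiled_probes(target, index, offset, length=11):
    """Return {base: probe} for the four probes (written 5' to 3') that
    interrogate target[index] at `offset` bases in from the probe 5' end."""
    # Probe position k pairs with window position (length - 1 - k).
    start = index - (length - 1 - offset)
    if start < 0 or start + length > len(target):
        raise ValueError("window falls outside the target")
    perfect = reverse_complement(target[start:start + length])
    return {b: perfect[:offset] + b + perfect[offset + 1:] for b in "ACGT"}


# Hypothetical 22-base target, written 5' to 3'.
example_target = "ACGTTGCAAGCTTAGGCTAACG"
example_probes = tiled_probes(example_target, index=12, offset=2)
for base, probe in example_probes.items():
    print(base, probe)
```

Exactly one of the four probes is fully complementary to the target window; the other three carry a single mismatch at the substitution position.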

The chip was initially hybridized with a 10 nM solution of Hprol in 6XSSP-T at 22°C for 30 minutes. After hybridization,
the chip was read, and then rinsed with water to remove the target molecules. A ligation reaction was then carried out
with 10 nM Hprol, 1.6 µM 6-mer pool (0.4 nM per oligo), and 2000 units of T4 DNA Ligase in 1 ml of ligation buffer. The
ligation reaction is allowed to proceed for 25 hours at 8°C, then 90 hours at 22°C, and finally 4 days at 8°C. At intervals
of 1 to 2 days, the solution is supplemented with additional T4 DNA ligase. Following the ligation reaction, the chip is
washed vigorously with water at about 45°C for 10 minutes, leaving only the labelled 6-mers that have been ligated to
the probe molecules. The chip is read, and the data analyzed.

The results of the hybridization and ligation reactions are analyzed in terms of the ability to make a correct base
call from the fluorescence signal measured on the chip. In particular, the signal is compared between the four probes
that differ by a single base at a given position within the 11-mer, with the rest of the 11-mer being perfectly complementary
to a specific region of the target sequence. For the purposes of this experiment, a base identification is said to be made
if the signal in at least one of the four probe regions is greater than the signal in a nearby region that has no oligonucleotide
probes (the background) by at least 5 counts (the background counts are usually about 2 - 6 counts), and if the signal
in one of the four regions is greater than that in the other three related regions by at least a factor of 1.2. If none of the
four signals is larger than the other three by a factor of at least 1.2, a multiple base ambiguity results. If the most intense
hybridization signal (by a factor of at least 1.2) is for a probe that is not perfectly complementary to the target sequence,
then a miscall results.
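The base-calling criteria above can be expressed as a small decision function. The sketch below is illustrative only: the function name, the status labels (including "no call" for signals that do not clear the background) and the choice to compare raw rather than background-subtracted signals are assumptions, and the example counts are hypothetical.

```python
def call_base(signals, background, expected=None, min_excess=5.0, min_ratio=1.2):
    """Classify one set of four probe signals.

    Returns (status, base), where status is 'no call', 'ambiguous',
    'correct', 'miscall', or 'call' when no expected base is given.
    """
    ranked = sorted(signals.items(), key=lambda item: item[1], reverse=True)
    (top_base, top), (_, runner_up) = ranked[0], ranked[1]
    # The strongest signal must exceed background by at least min_excess counts.
    if top - background < min_excess:
        return ("no call", None)
    # It must also exceed every other signal by at least min_ratio.
    if top < min_ratio * runner_up:
        return ("ambiguous", None)
    if expected is None:
        return ("call", top_base)
    return ("correct" if top_base == expected else "miscall", top_base)


# Hypothetical counts; `expected` is the base in the perfectly complementary probe.
print(call_base({"A": 30, "C": 8, "G": 7, "T": 9}, background=4, expected="A"))
print(call_base({"A": 30, "C": 28, "G": 7, "T": 9}, background=4, expected="A"))
print(call_base({"A": 10, "C": 30, "G": 7, "T": 9}, background=4, expected="A"))
```

Tallying the returned statuses over all 50 positions gives counts of correct calls, ambiguities and miscalls of the kind reported for each substitution position.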

Following hybridization, the 11-mer probes with substitution positions -1, -2, -3, and -4 all gave 49 correct base calls
and 1 multiple base ambiguity. The probe with substitution position -5 resulted in 50 correct base calls. Following ligation,
the probes with substitution positions -2 and -5 gave 48 correct calls and 2 miscalls, substitution position -3 yielded 48
correct calls and 1 ambiguity and 1 miscall, and substitution positions -1 and -4 both yielded 50 correct calls with no
ambiguities or miscalls. These results indicate that the ligation reaction with the full pool of 6-mers can be used to
specifically label hybrids between relatively complex targets and arrays of oligonucleotide probes.

It is interesting to note that the pattern of ligation (stronger or weaker signals, better or worse discrimination) is not 
in general the same as the pattern of hybridization. This suggests that these two approaches may be used as comple- 
mentary tools to obtain sequence information with arrays of oligonucleotide probes. For example, probes that produce 
large hybridization signals but are poorly discriminated may be better treated using a ligation step, and probes that do
not hybridize well to a particular complementary target (leading to a signal that is too small relative to the background)
may ligate well enough to be clearly detected (as also suggested by the mitigation of the base composition dependence
demonstrated in Example II, supra). 

C. PREPARATION OF UNIMOLECULAR, DOUBLE-STRANDED OLIGONUCLEOTIDES 
EXAMPLE I 



This example illustrates the general synthesis of an array of unimolecular, double-stranded oligonucleotides on a 
solid support. 

Unimolecular double stranded DNA molecules were synthesized on a solid support using standard light-directed 
methods (VLSIPS™ protocols). Two hexaethylene glycol (PEG) linkers were used to covalently attach the synthesized 
oligonucleotides to the derivatized glass surface. Synthesis of the first (inner) strand proceeded one nucleotide at a time 
using repeated cycles of photodeprotection and chemical coupling of protected nucleotides. The nucleotides each had
a protecting group on the base portion of the monomer as well as a photolabile MeNPoc protecting group on the 5' 
hydroxyl. Upon completion of the inner strand, another MeNPoc-protected PEG linker was covalently attached to the 5' 
end of the surface-bound oligonucleotide. After addition of the internal PEG linker, the PEG was photodeprotected, and
the synthesis of the second strand proceeded in the normal fashion. Following the synthesis cycles, the DNA bases 
were deprotected using standard protocols. The sequence of the second (outer) strand, being complementary to that 
of the inner strand, provided molecules with short, hydrogen bonded, unimolecular double-stranded structure as a result 
of the presence of the internal flexible PEG linker. 

An array of 16 different molecules was synthesized on a derivatized glass slide in order to determine whether short
unimolecular DNA structures could be formed on a surface and whether they could adopt structures that are recognized
by proteins. Each of the 16 different molecular species occupies a different physical region on the glass surface so that
there is a one-to-one correspondence between molecular identity and physical location. The molecules are of the form 
S-P-P-C-C-An--An--An--An--G-C-P-G-C-An--An--AAT-An--G-G-F where S is the solid surface having silyl groups, P is a 
PEG linker, A, C, G, and T are the DNA nucleotides, and F is a fluorescent tag. The DNA sequence is listed from the 3'
to the 5' end (the 3' end of the DNA molecule is attached to the solid surface via a silyl group and 2 PEG linkers). The 
sixteen molecules synthesized on the solid support differed in the various permutations of A and T in the above formula. 

EXAMPLE II 



This example illustrates the ability of a library of surface-bound, unimolecular, double-stranded oligonucleotides to 
exist in duplex form and to be recognized and bound by a protein. 

A library of 16 different members was prepared as described in Example I. The 16 molecules all have the same
composition (same number of As, Cs, Gs and Ts), but the order is different. Four of the molecules have an outer strand 
that is 100% complementary to the inner strand (these molecules will be referred to as DS, double-stranded, below). 
One of the four DS oligonucleotides has a sequence that is recognized by the restriction enzyme EcoR1. If the molecule
can loop back and form a DNA duplex, it should be recognized and cut by the restriction enzyme, thereby releasing the 
fluorescent tag. Thus, the action of the enzyme provided a functional test for DNA structure, and also served to dem- 
onstrate that these structures can be recognized at the surface by proteins. The remaining 12 molecules had outer 
strands that were not complementary to their inner strands (referred to as SS, single-stranded, below). Of these, three 
had an outer strand and three had an inner strand whose sequence was an EcoR1 half-site (the sequence on one strand 
was correct for the enzyme, but the other half was not). The solid support with an array of molecules on the surface is 
referred to as a "chip" for the purposes of the following discussion. The presence of fluorescently labelled molecules on
the chip was detected using confocal fluorescence microscopy. The action of various enzymes was determined by monitoring
the change in the amount of fluorescence from the molecules on the chip surface (i.e., "reading" the chip) upon
treatment with enzymes that can cut the DNA and release the fluorescent tag at the 5' end.

The three different enzymes used to characterize the structure of the molecules on the chip were: 

1) Mung Bean Nuclease - sequence independent, single-strand specific DNA endonuclease; 

2) DNase I - sequence independent, double-strand specific endonuclease; 

3) EcoR1 - restriction endonuclease that recognizes the sequence (5'-3') GAATTC in double stranded DNA, and cuts
between the G and the first A.

Mung Bean Nuclease and EcoR1 were obtained from New England Biolabs, and DNase I was obtained from Boehringer
Mannheim. All enzymes were used at a concentration of 200 units per ml in the buffer recommended by the manufacturer.
The enzymatic reactions were performed in a 1 ml flow cell at 22°C, and were typically allowed to proceed for 90 minutes.
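The recognition rule for the restriction enzyme can be illustrated with a short sketch. It locates GAATTC in a strand written 5' to 3' and reports the cut position between the G and the first A. It also shows that the site is its own reverse complement, which is why a strand carrying the site needs a complementary partner that also reads GAATTC before the enzyme can act. The function names and the example strands are hypothetical.

```python
SITE = "GAATTC"
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def cut_positions(strand):
    """Return, for each GAATTC site in `strand` (5' to 3'), the index at
    which the strand would be cut, i.e. just after the G."""
    cuts = []
    start = strand.find(SITE)
    while start != -1:
        cuts.append(start + 1)
        start = strand.find(SITE, start + 1)
    return cuts


# The site is palindromic: both strands of the duplex read GAATTC 5' to 3'.
print(reverse_complement(SITE) == SITE)

# Hypothetical strands, one with the site and one without.
with_site = "CCGAATTCGG"
cut = cut_positions(with_site)[0]
print(with_site[:cut], with_site[cut:])
print(cut_positions("CCGATATCGG"))
```

Note that this only models sequence recognition on one strand; whether a surface-bound molecule is actually cut also depends on its being in duplex form, which is what the enzymatic tests in this example probe.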

Upon treatment of the chip with the enzyme EcoR1, the fluorescence signal in the DS EcoR1 region and the 3 SS
regions with the EcoR1 half-site on the outer strand was reduced by about 10% of its initial value. This reduction was 
at least 5 times greater than for the other regions of the chip, indicating that the action of the enzyme is sequence specific 
on the chip. It was not possible to determine if the factor is greater than 5 in these preliminary experiments because of 
uncertainty in the constancy of the fluorescence background. However, because the purpose of these early experiments 
was to determine whether unimolecular double-stranded structures could be formed and whether they could be specif- 
ically recognized by proteins (and not to provide a quantitative measure of enzyme specificity), qualitative differences 
between the different synthesis regions were sufficient. 

The reduction in signal in the 3 SS regions with the EcoR1 half-site on the outer strand indicated either that the 
enzyme cuts single-stranded DNA with a particular sequence, or that these molecules formed a double-stranded struc- 
ture that was recognized by the enzyme. The molecules on the chip surface were at a relatively high density, with an 
average spacing of approximately 100 angstroms. Thus, it was possible for the outer strand of one molecule to form a 
double-stranded structure with the outer strand of a neighboring molecule. In the case of the 3 SS regions with the
EcoR1 half-site on the outer strand, such a bimolecular double-stranded region would have the correct sequence and
structure to be recognized by EcoR1. However, it would differ from the unimolecular double-stranded molecules in that
the inner strand remains single-stranded and thus amenable to cleavage by a single-strand specific endonuclease such 
as Mung Bean Nuclease. Therefore, it was possible to distinguish unimolecular from bimolecular double-stranded DNA 
molecules on the surface by their ability to be cut by single and double-strand specific endonucleases. 

In order to remove all molecules that have single-stranded structures and to identify unimolecular double-stranded 
molecules, the chip was first exhaustively treated with Mung Bean Nuclease. The reduction in the fluorescence signal 
was greater by about a factor of 2 for the SS regions of the chip, including those with the EcoR1 half-site on the outer 
strand that were cleaved by EcoR1, than for the 4 DS regions. Following Mung Bean Nuclease treatment, the chip was
treated with either DNase I (which cuts all remaining double-stranded molecules) or EcoR1 (which should cut only the 
remaining double-stranded molecules with the correct sequence). Upon treatment with DNase I, the fluorescence signal 
in the 4 DS regions was reduced by at least 5-fold more than the signal in the SS regions. Upon EcoR1 treatment, the 
signal in the single DS region with the correct EcoR1 sequence was reduced by at least a factor of 3 more than the 
signal in any other region on the chip. Taken together, these results indicated that the surface-bound molecules synthe- 
sized with two complementary strands separated by a flexible PEG linker form intramolecular double-stranded structures 
that were resistant to a single-strand specific endonuclease and were recognized by both a double-strand specific endo- 
nuclease, and a sequence-specific restriction enzyme. 

EXAMPLE III 

This example illustrates the strategy employed for the preparation of a conformationally restricted hexapeptide. 

A glass coverslip having aminopropylsilane spacer groups can be further derivatized on the amino groups with a 
poly-A oligonucleotide comprising nine adenosine monomers using VLSIPS™ ("light-directed") methods. The tenth adenine
monomer to be added will be a 5'-aminopropyl-functionalized phosphoramidite (available from Glen Research or
Genosys Biotechnologies). To the amine terminus is then added, in stepwise fashion, the hexapeptide, RQFKWT, beginning
with the carboxyl end of the peptide (i.e., as T-W-K-F-Q-R). A 3'-succinylated nucleoside can then be added under
peptide coupling conditions and the nucleotide synthesis of the poly-T tail can be continued to provide a conformationally 
restricted probe. 

It is to be understood that the above description is intended to be illustrative and not restrictive. Many embodiments 
will be apparent to those of skill in the art upon reviewing the above description. The scope of the invention should, 
therefore, be determined not with reference to the above description, but should instead be determined with reference 
to the appended claims, along with the full scope of equivalents to which such claims are entitled. 
The present invention provides greatly improved methods and apparatus for the study of nucleotide sequences and
nucleic acid interactions with other molecules. It is to be understood that the above description is intended to be illustrative
and not restrictive. Many embodiments and variations of the invention will become apparent to those of skill in the art
upon review of this disclosure. Merely by way of example, certain of the embodiments described herein will be applicable
to other polymers, such as peptides and proteins, and can utilize other synthesis techniques. The scope of the invention
should, therefore, be determined not with reference to the above description, but instead should be determined with 
reference to the appended claims along with the full scope of equivalents to which such claims are entitled. 



SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT: 

(A) NAME: Affymax Technologies, N.V.

(B) STREET: De Reyderkade 62 

(C) CITY: Curacao 

(D) STATE: 

(E) COUNTRY: Netherlands Antilles

(F) POSTAL CODE (ZIP) : 

(G) TELEPHONE: 

(H) TELEFAX: 

(I) TELEX: 

(ii) TITLE OF INVENTION: Methods of Enzymatic Discrimination 
Enhancement and Surface-Bound Double-Stranded DNA 

(iii) NUMBER OF SEQUENCES: 42 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Hepworth Lawrence Bryer & Bizley 

(B) STREET: Merlin House, Falconry Court, Baker's Lane, 

(C) TOWN: Epping 

(D) COUNTY: Essex 

(E) COUNTRY: UK 

(F) POST CODE: CM16 5DQ 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: EP 95 307501.7 

(B) FILING DATE: 20-OCT-1995 

(C) CLASSIFICATION: 
(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/327,522 

(B) FILING DATE: 21-OCT-1994 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/327,687 

(B) FILING DATE: 24-OCT-1994 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/533,582 

(B) FILING DATE: 18-OCT-1995 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Richard Edward Bizley

(B) REFERENCE/DOCKET NUMBER: APEP95996

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: +44 1992 561756 

(B) TELEFAX: +44 1992 561934 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
AGCCTAGCTG AA 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
TTCAGCTAGG CT 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
TTTTTAAAAA 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
AAAAATTTTT 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
AAAGAAAAAA GACAGTACTA AATGGA 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
AGTACTGTNT TTTTT 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TAGTACTGNC TTTTT 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
TTAGTACTNG CTTTT 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CTGTATCCGA CATCTGGTTA A 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CCAACCAAAC CCC 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
CCAACCAAAM NMM 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
ACTGTTAGCT AATTGG 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GGGGGGAGCT AACGGG 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
TACTGTATTT TTT 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
TACTGTCTTT TTT 13 

(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 13 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
TACTGTGTTT TTT 13 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TACTGTTTTT TTT 13 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GTACTGACTT TTT 13

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 13 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
GTACTGCCTT TTT 13 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
GTACTGGCTT TTT 13 

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GTACTGTCTT TTT 13

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
AGTACTATCT TTT 13 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
AGTACTCTCT TTT 13

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 13 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
AGTACTGTCT TTT 13 

(2) INFORMATION FOR SEQ ID NO: 25: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 
AGTACTTTCT TTT 13 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
GGGNCCCTTA A 11 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
TAAAGTAAGA CATAAC 16

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 16 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GGCTGACGTC AGCAAT 16 

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
TTGCTGACAT CAGCC 15 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
TTGCTGACCT CAGCC 15 


(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
TTGCTGACGT CAGCC 15 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
TTGCTGACTT CAGCC 15 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 12 

(D) OTHER INFORMATION: /mod_base= OTHER 

/note= "N = adenine covalently modified 
at the 3' hydroxyl group with 2 
polyethylene glycol (PEG) spacers" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 
CNCGCCGCGC AN 12 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1 

(D) OTHER INFORMATION: /mod_base= OTHER 

/note= "N = guanine covalently modified
at the 5' hydroxyl group with a
fluorescein molecule"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
NCGCGGCGCG AACGCAACGC 20 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 12 

(D) OTHER INFORMATION: /mod_base= OTHER 

/note= "N = adenine covalently modified 
at the 3' hydroxyl group with 2
polyethylene glycol (PEG) spacers" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
NCTTACGCGC AN 12 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 12 

(D) OTHER INFORMATION: /mod_base= OTHER 

/note= "N = adenine covalently modified 
at the 3' hydroxyl group with 2
polyethylene glycol (PEG) spacers" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
NCTTAATATA AN 12 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 




(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1 

(D) OTHER INFORMATION: /mod_base= OTHER 

/note= "N = guanine covalently modified
at the 5' hydroxyl group with a 
fluorescein molecule" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

NCGCGTAAGG CCTTCGACGT AG 22 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER 

/note= "N = thymine covalently modified 
at the 5' hydroxyl group with a 
fluorescein molecule" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

NATATTAAGG CCTTCGACGT AG 22 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 10

(D) OTHER INFORMATION: /mod_base= OTHER 



/note= "N = cytosine covalently modified 
at the 3' hydroxyl group with 2 
polyethylene glycol (PEG) spacers" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
NNNNNNGCGN 10 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 10 

(D) OTHER INFORMATION: /mod_base= OTHER 

/note= "N = cytosine covalently modified 
at the 3' hydroxyl group with 2 
polyethylene glycol (PEG) spacers" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
CCTTACGCGN 10

(2) INFORMATION FOR SEQ ID NO: 41: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

Arg Gln Phe Lys Val Val Thr
1 5 



(2) INFORMATION FOR SEQ ID NO: 42: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Thr Val Val Lys Phe Gln Arg
1 5 
Claims

1. A method for sequencing a target nucleic acid, said method comprising:

(a) combining: 

(i) a substrate comprising an array of chemically synthesized and positionally distinguishable oligonucle- 
otides each of which is complementary to a defined subsequence of preselected length; and 

(ii) a target nucleic acid; thereby forming target-oligonucleotide hybrid complexes of complementary sub- 
sequences of known sequence; 

(b) contacting said target-oligonucleotide hybrid complexes with a nuclease; thereby removing target-oligonu- 
cleotide complexes which are not perfectly complementary; and 

(c) determining which of said oligonucleotides have specifically interacted with subsequences in said target 
nucleic acid, to determine the sequence of said target nucleic acid. 
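By way of illustration only, and forming no part of the claims, the readout of steps (a) to (c) can be sketched in Python. The sketch assumes an idealized nuclease step in which every hybrid containing a mismatch is removed, so that a probe retains target only when its reverse complement occurs in the target. The function names are hypothetical, and the use of SEQ ID NO: 28 as the target with a complete array of 3-mers is an arbitrary choice made for the example.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def all_probes(k: int) -> list[str]:
    """All 4**k probes of length k (a complete k-mer array)."""
    return ["".join(bases) for bases in product("ACGT", repeat=k)]


def surviving_probes(target: str, probes: list[str]) -> set[str]:
    """Probes that still carry target after the idealized nuclease step.

    Assumption: every hybrid with a mismatch is removed, so a probe
    survives only if its reverse complement occurs in the target.
    """
    k = len(probes[0])  # all probes are assumed to share one length
    target_kmers = {target[i:i + k] for i in range(len(target) - k + 1)}
    return {p for p in probes if reverse_complement(p) in target_kmers}


# Illustrative usage: SEQ ID NO: 28 as the target, complete 3-mer array.
hits = surviving_probes("GGCTGACGTCAGCAAT", all_probes(3))
```

Each distinct subsequence of the target selects exactly one probe in this model; reassembling the target from the overlapping hits is a separate step that the sketch does not attempt.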

2. The method as recited in claim 1 wherein said target nucleic acid is ribonucleic acid (RNA), optionally said nuclease 
is an RNA nuclease, preferably RNase A. 

3. A method for sequencing a target nucleic acid, said method comprising: 

(a) combining: 

(i) a substrate comprising an array of chemically synthesized and positionally distinguishable oligonucle- 
otides each of which is complementary to a defined subsequence of preselected length; and 

(ii) a target nucleic acid which is longer than each of said probes; thereby forming target-oligonucleotide 
hybrid complexes of complementary subsequences of known sequence with a 3' target overhang; 

(b) contacting said target-oligonucleotide hybrid complexes with a ligase and a labelled, ligatable oligonucleotide 
probe; 

(c) removing unbound target nucleic acid and labelled, unligated oligonucleotide probes; and 

(d) determining which of said oligonucleotides contain said labelled, ligatable oligonucleotide probe as an indi- 
cation of a subsequence which is complementary to a subsequence of said target nucleic acid. 

4. The method as recited in claim 1 or claim 3 wherein said target nucleic acid is deoxyribonucleic acid (DNA). 

5. The method as recited in claim 4 when dependent on claim 1 wherein said nuclease is a DNA nuclease, preferably 
DNA nuclease S1 nuclease or Mung Bean nuclease. 

6. The method as recited in any preceding claim wherein said array of oligonucleotides recognizes substantially all 
possible subsequences of preselected length found in said target nucleic acid. 

7. The method as recited in any preceding claim, wherein each oligonucleotide is of a length between about 6 and 20 
bases, preferably between about 8 and 15 bases. 

8. The method as recited in any preceding claim, wherein said array of oligonucleotides comprises about 1,000 different
oligonucleotides, preferably about 3,000 different oligonucleotides, preferably about 10⁴ different oligonucleotides,
more preferably about 10⁵ different oligonucleotides, even more preferably about 10⁶ different oligonucleotides.
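For orientation only, the array sizes recited above can be related to probe length under the assumption, which the claims do not require, that the array is a complete set of all DNA sequences of one length k. Such a set has 4^k members. The helper names below are hypothetical.

```python
def complete_array_size(k: int) -> int:
    """Number of distinct DNA probes of length k (four bases per position)."""
    return 4 ** k


def shortest_length_covering(n_probes: int) -> int:
    """Smallest probe length k whose complete array has at least n_probes members."""
    k = 1
    while complete_array_size(k) < n_probes:
        k += 1
    return k


# Relate each recited array size to the shortest complete k-mer set that reaches it.
for n in (1_000, 3_000, 10**4, 10**5, 10**6):
    k = shortest_length_covering(n)
    print(f"about {n} probes -> k = {k} (4**{k} = {complete_array_size(k)})")
```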

9. The method as recited in any one of claims 3, 4 or 6 to 8 wherein said ligase is a member selected from the group 
consisting of T4 DNA ligase, ligases isolated from E. coli and ligases isolated from bacteriophages. 

10. A method for sequencing an unlabeled target oligonucleotide, said method comprising: 

(a) combining: 

(i) a substrate comprising an array of positionally distinguishable oligonucleotide probes each of which has 
a constant region and a variable region, said variable region capable of binding to a defined subsequence 
of preselected length; 
(ii) a constant oligonucleotide having a sequence which is complementary to said constant region of said
oligonucleotide probes; 

(iii) a target oligonucleotide to be sequenced; and 

(iv) a ligase, thereby forming target-oligonucleotide hybrid complexes of complementary subsequences of 
known sequence; 

(b) contacting said target oligonucleotide-oligonucleotide probe hybrid complexes with a ligase and a pool of 
labelled, ligatable oligonucleotide probes of a preselected length, said pool of labelled, ligatable oligonucleotide 
probes representing all possible sequences of said preselected length; 

(c) removing unbound target nucleic acid and labelled, unligated oligonucleotide probes; and 

(d) determining which of said oligonucleotide probes contain said labelled, ligatable oligonucleotide probe as 
an indication of a subsequence which is complementary to a subsequence of said target oligonucleotide. 

11. A method for sequencing an unlabeled target oligonucleotide, said method comprising:

(a) contacting an unlabelled target oligonucleotide with a library of labelled oligonucleotide probes, each of said 
oligonucleotide probes having a known sequence and being attached to a solid support at a known position, to 
hybridize said target oligonucleotide to at least one member of said library of probes, thereby forming a hybrid- 
ized library; 

(b) contacting said hybridized library with a nuclease capable of cleaving double-stranded oligonucleotides to 
release from said hybridized library a portion of said labelled oligonucleotide probes or fragments thereof; and 

(c) identifying said positions of said hybridized library from which labelled probes or fragments thereof have 
been removed, to determine the sequence of said unlabelled target oligonucleotide. 

12. A synthetic unimolecular, double-stranded oligonucleotide library comprising a plurality of different members, each 
member having the formula: 

Y-L¹-X¹-L²-X²

wherein, 

Y is a solid support; 

X¹ and X² are a pair of complementary oligonucleotides;

L¹ is a spacer;

L² is a linking group having sufficient length such that X¹ and X² form a double-stranded oligonucleotide.

13. A library in accordance with claim 12, wherein L² is a member selected from the group consisting of an alkylene
group, a polyethyleneglycol group, a polyalcohol group, a polyamine group and a polyester group.

14. A library in accordance with claim 12 or claim 13, wherein X¹ and X² are complementary oligonucleotides each
comprising of from 6 to 30 nucleic acid monomers.
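Purely as an illustration of the formula Y-L¹-X¹-L²-X², the following Python sketch builds one library member from a chosen X¹. It assumes that, read along the single strand, X² is the reverse complement of X¹, so that the strand can fold back on itself across the linking group. The class name, the function names and the default labels for the support, spacer and linker are hypothetical.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


@dataclass(frozen=True)
class LibraryMember:
    """One member Y-L1-X1-L2-X2, read outward from the support."""
    support: str  # Y
    spacer: str   # L1
    x1: str       # first oligonucleotide, in strand order
    linker: str   # L2
    x2: str       # second oligonucleotide, in strand order


def make_member(x1: str, support: str = "silica",
                spacer: str = "PEG", linker: str = "PEG") -> LibraryMember:
    """Build a member whose X2 folds back onto X1 (assumed hairpin geometry)."""
    if not 6 <= len(x1) <= 30:
        raise ValueError("X1 must have from 6 to 30 monomers")
    x2 = x1.translate(_COMPLEMENT)[::-1]  # reverse complement of X1
    return LibraryMember(support, spacer, x1, linker, x2)


def forms_duplex(member: LibraryMember) -> bool:
    """True if every base of X2, read backwards, pairs with X1."""
    return member.x2[::-1].translate(_COMPLEMENT) == member.x1


member = make_member("GGCTGACG")  # arbitrary example sequence
```

The sketch checks base pairing only; whether a given linking group is long enough to permit the fold is a physical question that the code does not model.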

15. A library in accordance with any one of claims 12 to 14, wherein said solid support is a silica support and L¹ comprises
an aminoalkylsilane and from 1 to 4 hexaethyleneglycols. 

16. A synthetic unimolecular, double-stranded oligonucleotide library of any one of claims 12 to 15, wherein a portion 
of said double-stranded oligonucleotides formed by X¹ and X² further comprise a bulge or a loop.

17. A synthetic unimolecular, double-stranded nucleic acid library of any one of claims 12 to 16, wherein each member
further comprises an identifier tag, said identifier tag identifying the sequence of said unimolecular, double-stranded 
nucleic acid. 

18. A synthetic unimolecular, double-stranded nucleic acid library of any one of claims 12 to 17, wherein said solid 
support comprises a first bead linked to a second bead, wherein the double-stranded nucleic acid is attached to the
first bead and an identifier tag is attached to the second bead. 

19. A method of forming a plurality of diverse unimolecular, double-stranded oligonucleotides on a solid support having
optional spacers, said support comprising a surface with a plurality of preselected regions, said method comprising: 
(a) forming on each of said preselected regions a different first oligonucleotide, each of said first oligonucleotides
comprising of from 6 to 30 monomers; 

(b) attaching to the distal end of each of said first oligonucleotides of step (a) a linking group; and

(c) forming on the distal end of each of said linking groups a second oligonucleotide, wherein each of said 
second oligonucleotides is complementary to said first oligonucleotide which is attached within the same prese- 
lected region, and wherein said linking groups have sufficient length such that said first and second oligonucle- 
otides form a unimolecular, double-stranded oligonucleotide. 

20. A method of screening a sample for a species capable of binding to double-stranded DNA comprising: 

contacting said sample with a solid support comprising unimolecular, double-stranded DNA attached thereon, 
each of said attached DNA independently having the formula:

-X¹¹-L-X¹²

wherein, 

X¹¹ and X¹² are complementary oligonucleotides; and

L is a linking group having sufficient length such that X¹¹ and X¹² form said attached unimolecular, double-stranded
DNA, to produce at least one bound pair comprising said species and one of said attached unimolecular,
double-stranded DNA; and

identifying said bound pair. 

21. A method in accordance with claim 20, wherein said species is a member selected from the group consisting of a 
drug, a protein and an RNA molecule. 

22. A method of screening a sample for a species capable of binding to double-stranded DNA comprising: 

contacting said sample with a solid support comprising a unimolecular, double-stranded DNA attached thereon,
said attached DNA having the formula:

-X¹¹-L-X¹²

wherein, 

X¹¹ and X¹² are complementary oligonucleotides; and

L is a linking group having sufficient length such that X¹¹ and X¹² form said attached unimolecular, double-stranded
DNA, to produce a bound pair comprising said species and said attached unimolecular, double-stranded
DNA; and

identifying said bound pair. 

23. A synthetic conformationally-restricted probe library comprising a plurality of members, each of said members
comprising a solid support attached to an oligomer having the formula:

-X¹¹-Z-X¹²

wherein, 

X¹¹ and X¹² are complementary oligonucleotides; and

Z is a probe having sufficient length such that X¹¹ and X¹² form a double-stranded portion of said member
and thereby restrict the conformations available to said probe. 

24. A synthetic library in accordance with claim 23, wherein each of said probes is a peptide having of from about 4 to 
about 12 amino acids and optionally each member further comprises an intercalating dye. 

25. A method of synthesizing a library of conformationally-restricted probes on a solid support having optional spacers, 
said support comprising a surface with a plurality of preselected regions, said method comprising: 

(a) forming on each of said preselected regions a first oligonucleotide, each of said first oligonucleotides com- 
prising of from 6 to 30 monomers; 

(b) attaching to the distal end of each of said first oligonucleotides of step (a) a probe; and

(c) forming on the distal end of each of said probes a second oligonucleotide, wherein each of said second 
oligonucleotides is complementary to said first oligonucleotide which is attached within the same preselected 
region, and wherein said probes have sufficient length such that said first and second oligonucleotides form a 
unimolecular, double-stranded oligonucleotide thereby conformationally-restricting said probes.

26. A method in accordance with claim 19 or claim 25, wherein said method of construction of step (a) and step (b) is 
by light-directed synthesis. 

27. A method of screening a sample for a species capable of binding to a conformationally-restricted probe comprising: 

contacting said sample with a solid support comprising conformationally-restricted probes attached thereon, 
each of said attached probes independently having the formula:

-X¹¹-Z-X¹²

wherein, 

X¹¹ and X¹² are complementary oligonucleotides; and

Z is a probe having sufficient length such that X¹¹ and X¹² form a double-stranded oligonucleotide portion of
said conformationally-restricted probe, to produce at least one bound pair comprising said species and one of said
attached conformationally-restricted probes; and

identifying said bound pair. 

28. An adhesive for use in biological applications comprising a first surface having a plurality of attached oligonucleotides 
and a second surface having a plurality of attached oligonucleotides, wherein the oligonucleotides of said first surface 
are substantially complementary to the oligonucleotides of said second surface. 

29. A synthetic intermolecular, doubly-anchored, double-stranded oligonucleotide library comprising a plurality of dif- 
ferent members, each member having the formula: 




wherein, 

Y is a solid support; 

X¹ and X² are a pair of complementary oligonucleotides;

L¹ and L² are each independently a bond or a spacer.

30. A library in accordance with claim 29, wherein L¹ and L² each independently comprise a member selected from the
group consisting of an alkylene group, a polyethyleneglycol group, a polyalcohol group, a polyamine group and a
polyester group, preferably L¹ and L² each independently comprise a polyethylene glycol group.

31. A library in accordance with claim 29 or claim 30, wherein X¹ and X² are complementary oligonucleotides each
comprising of from 6 to 30 nucleic acid monomers. 

32. A library in accordance with any one of claims 29 to 31, wherein said solid support is a silica support and L¹ comprises
an aminoalkylsilane and from 1 to 4 hexaethyleneglycols. 

33. A method of preparing a single-stranded nucleic acid sequence, said method comprising: 

(a) forming a hybrid complex by combining at least two oligonucleotides which are phosphorylated at their 5' 
ends with a chip-bound oligonucleotide, said chip-bound oligonucleotide having subsequences which are
complementary to a subsequence of each of said oligonucleotides;

(b) contacting said hybrid complex with a ligase to form a ligated oligonucleotide; and

(c) releasing said ligated oligonucleotide from said chip-bound oligonucleotide to form a single-stranded nucleic 
acid sequence. 
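As a non-limiting illustration of steps (a) to (c), the following Python sketch treats the chip-bound oligonucleotide as a template and joins the supplied oligonucleotides only when, taken together, they are perfectly complementary to one contiguous stretch of that template. The function names and the example sequences are hypothetical, the oligonucleotides are assumed to be listed in the 5' to 3' order of the product, and the 5' phosphate required for ligation is not modelled.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def ligate_on_template(template: str, oligos: list[str]) -> Optional[str]:
    """Join oligos in the given order if they anneal side by side on the template.

    Idealized model: ligation succeeds only when the concatenated oligos
    are perfectly complementary to one contiguous stretch of the template.
    """
    joined = "".join(oligos)
    if reverse_complement(joined) in template:
        return joined  # the released single-stranded sequence of step (c)
    return None


# Hypothetical chip-bound template and two adjacent oligonucleotides.
template = "ATTGCTGACGTCAGCC"
ligated = ligate_on_template(template, ["GGCTGACG", "TCAGCAAT"])
```

A mismatch in either oligonucleotide, or a gap between them, makes the function return None, which corresponds in this model to no ligated product being released.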